Chapter 1


Introduction 


In the past few decades, the computer industry has been revolutionized by a number of new
philosophies and techniques. Some of them have been successful and some are still under
development. Software engineering is one of the latter.

Software  engineering  is  the  application  of  a  disciplined,  systematic,  quantifiable  approach  to 
the  development,  operation,  and  maintenance  of  software  [STAND88].  The  purpose  of  software 
engineering  is  to  improve  not  only  the  productivity  but  also  the  reliability,  maintainability, 
and  the  controllability  of  the  software  and  software  design  process. 

To achieve this goal, various methodologies and principles of software design have been
proposed since the 1960s, such as structured programming, top-down design, and bottom-up
design. Different methodologies use different representations: hierarchy diagrams,
flowcharts, structure charts, or data flow diagrams. These tools were developed from different
perspectives to capture different characteristics of a software design. For example, hierarchy
diagrams describe the control structure of the software system, while data flow diagrams show
how data flows among the modules in the system without regard to control.

One very important problem of software engineering technologies is that they are usually
qualitative [DEMAR82]. The saying "You can not control what you can not measure" [DEMAR82]
still haunts us. The consequence is that, until this problem is solved, we cannot use software
engineering techniques as a reliable means of software production management. This is why, in
the recent decade, people have started to put effort into software measures research. They have
tried to provide a sound mathematical basis for quantifying software engineering tools so that,
in the long run, software quality can be judged by the quantitative evaluation of designs.
McCabe's and Halstead's measures are examples of quantifying software designs such as
flowcharts, or even code. Henry and Kafura's measure is designed to evaluate the complexity of
data processing in the software based on the data flow diagram. Since these measures are
calculated from different representations of software design, they reflect different
characteristics of the software system.

Compared to other engineering disciplines, software engineering still needs to build a sound
foundation of measurement as the basis of a real scientific discipline of its own. Software
engineers demand good mathematical formalism in system design. This is one of the motivations
of the thesis.

1.1.   The  Goals  of  the  Thesis 

The purpose of this research is to quantify the expertise involved in evaluating data flow
diagrams (DFDs) and to give DFD designers useful guidance based on the classified evaluation
of DFDs.

The goals of this research are shown in figure 1.1, which gives the scheme of the research and
the major tasks involved. The first task is to linearize the two-dimensional DFDs by
introducing another representation of DFDs, the textual representation (TR), which provides a
good basis for the later tasks. The second task is to develop valid measures for DFDs that
reflect their design qualities so that DFDs can be judged in a proper way. The third task is to
classify the categories of evaluation criteria based on the valid measures and the poll results
from a survey conducted in this research. The theoretical background of this task is fuzzy set
theory, and the methodology is to build fuzzy membership functions by using a linguistic
approach. Finally, the fuzzy classifications of the DFD evaluations are used to guide DFD
design.

The theoretical basis is built upon both conventional mathematics and fuzzy set theory. Fuzzy
membership functions are set up to classify measures so that they reflect the evaluations of
DFDs given by experts in a survey. We adopt fuzzy set theory because we want to provide more
intuitive evaluation opinions, such as 'good' or 'fairly complex', which are closer to human
language and give people a better feeling for the quality of the software design. In
particular, many measures nowadays have no defined unit, so a claim such as "the measure is
570" usually gives users little information. By using fuzzy membership functions, we can group
measures into several levels over their ranges so that the relative level of a specific DFD can
always be obtained. In this work, the membership functions are derived from the empirical
evaluations by the experts.
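As an illustration only, the grouping of a raw measure into linguistic levels can be sketched
as follows. This is a minimal sketch, not the membership functions of this thesis: the
trapezoidal shape, the level names, the breakpoints, and the assumed measure range of 0 to 1000
are all hypothetical, whereas the actual functions are derived from the expert survey.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree in [0, 1] for a trapezoid with feet a, d and plateau b..c."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical linguistic levels over an assumed measure range of 0..1000.
LEVELS = {
    "simple": (0, 0, 200, 400),
    "fairly complex": (200, 400, 600, 800),
    "very complex": (600, 800, 1000, 1000),
}


def grades(measure):
    """Membership degree of one measure value in every level."""
    return {name: trapezoid(measure, *shape) for name, shape in LEVELS.items()}


def best_level(measure):
    """Level with the highest membership degree."""
    g = grades(measure)
    return max(g, key=g.get)


print(grades(650))
print(best_level(570))
```

Under these assumed breakpoints, a bare value such as 570 is reported as a level name together
with a degree of membership, which is the kind of relative statement the fuzzy approach aims at.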

In order to develop an intuitive way to derive measures, we first linearize DFDs, i.e., we map
a DFD into a linear, textual representation. This representation must keep all the
characteristics of the original DFD. Then a number of measures are developed. The methodology
guarantees that these measures can be obtained easily and directly from the textual
representations.

1.2.   Terminology 

Researchers in software engineering have been using terms such as complexity,
interconnections, and width of the graph to refer to different things. Therefore, it is useful
to define some terms used in this thesis before we continue. Some terms are defined according
to the "Software Engineering Standards" [STAND87] and some according to the definitions in
this research.


Complexity    : 

Complexity is the degree to which a system or component has a design or implementation that
is complicated or difficult to understand. In this research, we emphasize the structural
complexity of DFDs, that is, the complexity that arises from a software document itself.

Modularity    : 

Modularity  is  the  extent  to  which  software  is  composed  of  discrete  components  such  that 
a  change  to  one  component  has  minimal  impact  on  other  components. 

Cohesion    : 

Cohesion  is  the  degree  to  which  the  tasks  performed  by  a  single  program  module  are 
functionally  related. 

Interconnection    : 

The  connections  among  two  or  more  components  for  the  purpose  of  passing  information 
from  one  to  the  other. 

Token    : 

A  token  is  a  data  item  that  need  not  be  subdivided  within  a  module  when  it  is  passed  to 
this  module  to  be  processed. 

Path    : 

A path in a DFD is a unique sequence of data/module names that goes from an external input of
the DFD to an external output of the DFD. We also refer to it as an Li-path, meaning linearly
independent path.

Width  of  the  data  usage    : 

The  width  of  a  specific  independent  data's  usage  is  the  number  of  occurrences  of  this 
data  in  the  DFD. 


Burden    : 

Burden means the degree of loading on a certain structure, which can be a module in a DFD or
even the whole DFD. Loading can refer to different things, such as tokens or paths. For
example, the token burden of a module is the number of tokens this module processes, including
both input and output tokens. The path burden of a module is the number of paths that go
through this module. The interconnection burden of a whole DFD is the number of
interconnections the graph contains.

Some other terms used in this research are defined in the chapters where they appear, because
real graph examples are needed to explain them.
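To make the definitions of path and burden concrete, the following minimal sketch counts them
on a small DFD. The dictionary encoding, the example diagram, and the function names are
assumptions made for illustration; each named data flow is treated as one token, and the DFD is
assumed to contain no loops.

```python
# Hypothetical DFD: module name -> (input data names, output data names).
DFD = {
    "1": (["a"], ["b", "M1"]),
    "2": (["M1"], ["x"]),
    "3": (["M1", "c"], ["y"]),
}


def token_burden(dfd, module):
    """Number of input plus output tokens handled by one module."""
    inputs, outputs = dfd[module]
    return len(inputs) + len(outputs)


def paths(dfd):
    """All data/module sequences from an external input to an external output."""
    produced = {d for _, outs in dfd.values() for d in outs}
    consumed = {d for ins, _ in dfd.values() for d in ins}
    found = []

    def walk(data, trail):
        if data not in consumed:          # nobody reads it: external output
            found.append(trail)
            return
        for module, (ins, outs) in dfd.items():
            if data in ins:
                for out in outs:
                    walk(out, trail + [module, out])

    for source in sorted(consumed - produced):   # data nobody produces: external input
        walk(source, [source])
    return found


def path_burden(dfd, module):
    """Number of paths that pass through one module (modules sit at odd positions)."""
    return sum(module in p[1::2] for p in paths(dfd))


for p in paths(DFD):
    print(" ".join(p))
print(token_burden(DFD, "1"), path_burden(DFD, "1"))
```

In this example the paths are a 1 b, a 1 M1 2 x, a 1 M1 3 y, and c 3 y, so three of the four
paths go through module 1.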

1.3.   Hypothesis 

Considering the nature of structural measures of software tools, we adopt the following
hypotheses throughout this research. The beliefs behind these hypotheses are stated clearly in
the report "A Mathematical Perspective for Software Measures Research" written by Austin C.
Melton, Albert L. Baker, James M. Bieman, and David A. Gustafson [AUSTI88].

Suppose we have a set D of similar software documents. Let M be a software document measure
defined on D, and let C be a quantifiable criterion that captures an intuitive feeling about a
specific nature of the software documents in D. Here, M is a quantitative predictor of C
[AUSTI88]. We restate the two assumptions below, with a slight change in the first one
(compared to the corresponding assumption in the report):

Assumption  1    : 

There  is  an  order  on  the  documents  in  D;  the  order  is  based  on  the  relative  (  complexity 
)  degree  of  the  measure. 

Assumption  2    : 

A valid measure M preserves or carries the "correct order" on D into R, where R is the set of
real numbers. That is, if two documents d1 and d2 are related by the "correct order", then the
images M(d1) and M(d2) are related in the same relative order in R.

Here,  the  "correct  order"  is  determined  by  the  criterion  that  is  used  to  reflect  a  specific  nature 
of  the  software  document,  e.g.,  complexity  or  cohesion. 
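Assumption 2 can be checked mechanically once both the "correct order" and the measure are
available. The sketch below is only an illustration under assumptions: the order is encoded as
hypothetical numeric ranks (for example, taken from expert judgment), the order is read as
non-strict, and the document names and measure values are invented.

```python
from itertools import permutations


def preserves_order(rank, measure):
    """True if rank[d1] <= rank[d2] always implies measure[d1] <= measure[d2]."""
    return all(
        measure[d1] <= measure[d2]
        for d1, d2 in permutations(rank, 2)
        if rank[d1] <= rank[d2]
    )


# Hypothetical "correct order" of three documents by complexity.
rank = {"d1": 1, "d2": 2, "d3": 3}

candidate_a = {"d1": 10, "d2": 25, "d3": 25}   # never reverses the order
candidate_b = {"d1": 10, "d2": 40, "d3": 25}   # reverses d2 and d3

print(preserves_order(rank, candidate_a))
print(preserves_order(rank, candidate_b))
```

A measure that fails this check on documents whose order is agreed upon cannot be valid in the
sense of Assumption 2.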

1.4.   Contents  of  The  Thesis 

The thesis is organized as follows. The first chapter (this chapter) introduces the background
of the research, some of the terminology used in the thesis, the hypotheses of the research,
and the contents of the thesis.

Chapter two concerns the development of the linear representation of DFDs. It covers Adler's
approach to decomposing DFDs, the necessity of developing a new representation of DFDs, and,
most importantly, the construction rules for our linear representation of DFDs, illustrated
through examples, together with its striking characteristics.

Based on the linear representation developed in chapter two, chapter three builds structural
measures of DFDs under the mathematical hypotheses stated in this introduction. Examples are
also given to illustrate the calculation and the usage of the measures.

Chapter four deals with the survey done in this research. It explains the survey environment
and the data obtained from the survey, introduces the theoretical background and mathematical
techniques used in the data analysis, and summarizes the conclusions of the analysis.

The fifth chapter discusses how to use fuzzy set theory to set up fuzzy membership functions
for the DFD evaluation expertise, which should (ideally) match certain categories of the
measurements of the linear representations of DFDs. The basic concepts of fuzzy set theory and
the principles of constructing membership functions are also given.


The last chapter, chapter six, gives the overall conclusions of this research and the author's
thoughts on possible future work in this area.


Chapter  2 
Textual  Representation  of  DFDs 


The data flow diagram is a very important tool in software design. It gives the system
implementors a dynamic overview of the data flow in the system. The decomposition of DFDs is a
top-down method that takes a process together with its inputs and outputs and logically
describes the process as a set of smaller processes. Even today, however, decomposition still
requires analysts to apply heuristics and expertise to the problem itself [ADLER88]. Because of
this, good DFD decompositions depend on the intuition of the software designers. A systematic
methodology is therefore needed in order to produce quality DFD designs.

Mike Adler developed an algebra to formalize the process of decomposition based on DeMarco's
representation schema [ADLER88]. It can be used as a guideline or a tool in DFD
decompositions. To show its efficiency, he also proposed some measures to evaluate the quality
of the resulting decomposed DFDs. There are still some problems in this approach, which are
listed below:

Problem  1    : 

With this approach, many DFD decomposition problems become trivial. The extreme situation is
when all the inputs contribute to all the outputs; no decomposition can be done in this case.
Unfortunately, that is fairly common in the real world. We applied Adler's approach to some
data flow designs from real projects in this research, and a fair number of them became
trivial. This means that his approach is still not general enough to be applied to real design
problems.

Problem  2    : 

The measure used in this approach to judge the quality of the decomposed DFDs does not take
into consideration some criteria that are usually crucial to DFD designs, such as complexity,
interconnections, or cohesion. There is no measurement involved in this evaluation measure.
For example, his criterion for a decomposed DFD to be "optimal" is "equivalent to initial
sentence and non-trivial". First of all, he did not state in what sense "optimal" is defined.
Secondly, since he did not prove the uniqueness of the decomposition (in fact, there is no
uniqueness, because some of the operators are not unique), more than one decomposition can
satisfy the "optimal" criterion.

Problem  3    : 

When the decomposition process cannot go further because no operator can be applied, a special
type of operator (for example, the weak substitution) can be used to add extra data flow to
continue the process. But the semantics of the data flow information is changed in this
process, and the resulting decomposition will not actually be optimal because extra or
redundant data flow will be included. Adler admitted in his article that "if those elements
are not already in the matrix [the matrix is a table used to denote the relationship of inputs
and outputs of the DFD], the graph interpretation of the transform could change" [ADLER88].

Problem  4    : 

The resulting DFDs have so-called "local flows", which are actually interconnections of the
DFDs. But in his approach, these interconnections are produced during decomposition without
knowing their meanings. The burden of finding out their meanings is left to the designers. The
problem is that it is even harder for designers to attach correct meanings to a given graph
than to produce a meaningful decomposition manually.

The author thinks that the evaluation of decompositions should rely on a reliable methodology
for evaluating DFDs. The existing problems of Adler's approach lead us to develop a textual
representation for DFDs with the purposes of 1) getting a more general abstract form of DFDs,
2) reflecting the characteristics of DFDs through the representation, and 3) providing a
useful basis for formalizing the evaluation of DFDs. These goals are stated more specifically
as requirements of the textual representation of DFDs in the next section.

2.1.    Requirements  of  The  Textual  Representation 

Bearing in mind the purposes of developing this textual representation for DFDs, we require
that it satisfy the following:

One-to-One  Correspondence    : 

One-to-one correspondence should exist in the sense that the textual representation(s)
obtained from a specific DFD correspond to only one recovered DFD.

Monotonicity    : 

Monotonicity here means that adding to a DFD makes the textual representation stay the same or
grow. Monotonicity has to hold for the textual representation because of the hypotheses of
this research. The representation is another form (or a projection) of a DFD, so it is valid
if and only if it preserves the nature of the DFD. Therefore, whenever a DFD becomes more
complex, the corresponding representation should reflect the change, i.e., it should also
become more complex.

Preserving  Information    :

The representation must preserve all the important information in the DFD, for example, the
number of modules in the DFD, the number of possible data flow paths in the whole graph, or
the number of interconnections among modules of the DFD. Most importantly, it must preserve
the semantics of the DFD, such as the relationships among data flows.

Ease  of  converting    : 

The components of the representation and the components of the DFD should have fairly simple
correspondences, i.e., it should not be hard to convert a DFD into its textual form and vice
versa.

If we can develop a representation satisfying the above requirements, it will be a good basis
for systematically deriving measures directly from the representation. The rules for
converting a DFD into its textual representation are given in section 2.2.

2.2.    Converting  Rules 

We build the textual representation (TR) of a DFD by choosing one of its paths and then
constructing the TR from the inputs and the module of this path. Choosing different paths
gives different TRs. But as stated under One-to-One Correspondence in the previous section,
they will all recover the same DFD. Appendix G gives several examples to show this.

We give the syntax of TR in EBNF after first introducing some fundamental symbols used in TR.
Then we explain the semantics of the more complicated symbols used in this grammar separately.
Examples are given to aid understanding. Figure 2.1 gives an example of rules (1) through (5).

1)        Each module in the DFD corresponds to an arrow in TR, with the module name written
on top of the arrow. For example, a module named "module" is transformed to an arrow -->
with "module" written above it.


2)        Each input to a module appears to the left of the arrow for the module, and
different inputs are separated by a comma ',' in TR.

Figure  2.1     Example of converting rules (1) -- (5)

3)  Each output from a module appears to the right of the arrow for the module, and different
outputs are separated by a vertical bar '|' in TR.

4)  Each external input of the DFD is enclosed in single quotes (') in TR.

5)  Each external output of the DFD is enclosed in double quotes (") in TR.
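Rules (1) through (5) for a single module can be sketched as a small formatting function. This
is an illustration under assumptions: the function name and parameters are hypothetical, and
because plain text cannot place the module name on top of the arrow, the sketch writes the
name inside the arrow, as in -1->, which is not the notation of the thesis itself.

```python
def render_module(name, inputs, outputs, external_inputs=(), external_outputs=()):
    """Format one module as: inputs, arrow carrying the module name, outputs."""

    def quote(data):
        if data in external_inputs:
            return f"'{data}'"        # rule 4: external inputs in single quotes
        if data in external_outputs:
            return f'"{data}"'        # rule 5: external outputs in double quotes
        return data

    left = ", ".join(quote(d) for d in inputs)       # rule 2: comma-separated inputs
    right = " | ".join(quote(d) for d in outputs)    # rule 3: bar-separated outputs
    return f"{left} -{name}-> {right}"               # rule 1: one arrow per module


print(render_module("1", ["a", "d"], ["c"],
                    external_inputs={"a", "d"}, external_outputs={"c"}))
```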

Based on the fundamental symbols introduced above, we now turn to the syntax of TR before the
other rules are explained. The syntax is expressed in EBNF as follows:

<TR>  ::=   <INPUT>  <INTERNAL>  <OUTPUT>

                     <MNAME>
<INTERNAL>  ::=       "-->"     |     <SHARE>

<SHARE>    ::=    "{"    <TR>    <MORE_SHARE1>    "}"

<MORE_SHARE1>  ::=   ":"    <TR>   <MORE_SHARE2>

<MORE_SHARE2>  ::=   <MORE_SHARE1>    |   ε

<INPUT>    ::=   <PARALLEL_DATA>  |  (  "'"  <MNAME>  "'"  |
                 <NAME>  |  <NAME>  "*"  |  "["  <NAME>  "]"  )
                 <MORE_INPUT>*  |   ε

<MORE_INPUT>  ::=    ","  (  "'"  <MNAME>  "'"  |  <NAME>  |
                     "["  <NAME>  "]"  |  <NAME>  "*"  |
                     <NEST_TR>   )

<PARALLEL_DATA>  ::=   <PARALLEL_ITEM>   <MORE_PARALLEL1>

<MORE_PARALLEL1>  ::=  ";"  <PARALLEL_ITEM>  <MORE_PARALLEL2>

<MORE_PARALLEL2>  ::=  <MORE_PARALLEL1>  |  ε

<PARALLEL_ITEM>  ::=   <NEST_TR>    |    <NAME>

<NEST_TR>         ::=   "("    <TR>    ")"

<OUTPUT>  ::=  (  '"'  <MNAME>  '"'  |  <NAME>  "*"  |  <NEST_TR>  )
               <MORE_OUTPUT1>*   <MORE_OUTPUT2>*  |  ε

<MORE_OUTPUT1>  ::=   "|"  (  '"'  <MNAME>  '"'  |  <NAME>  "*"  |
                      <NEST_TR>  |  ε  )

<MORE_OUTPUT2>  ::=    "||"  (  '"'  <MNAME>  '"'  |  <NEST_TR>  |  ε  )

                     -->          <--
<NAME>        ::=   string    |  string    |    <MNAME>

<MNAME>       ::=   string

The grammar involves more symbols, such as "[ ]", "*", and "||". We explain their meanings one
by one in the following rules and show examples of how to use them.
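The terminal symbols of the grammar can be separated mechanically before any parsing is
attempted. The following is a minimal tokenizer sketch, not part of the thesis. It assumes a
flat, one-line spelling of TR in which the module name is written inside the arrow (for
example -1->); the token names are hypothetical, and the upper arrows placed over data names
(rule 8) are not handled.

```python
import re

TOKEN = re.compile(r"""
    (?P<EXT_IN>'[^']*')          # external input in single quotes
  | (?P<EXT_OUT>"[^"]*")         # external output in double quotes
  | (?P<ARROW>-\w*->)            # module arrow; inline module name is an assumption
  | (?P<LOOP>\|\|)               # loop separator
  | (?P<PUNCT>[{}()\[\]:;,|*])   # braces, parentheses, brackets, separators, star
  | (?P<NAME>\w+)                # internal data name
  | (?P<SKIP>\s+)                # whitespace, dropped
""", re.VERBOSE)


def tokenize(text):
    """Split a one-line TR string into (kind, text) pairs."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, value in tokenize("""'a' -1-> ( M1 -2-> "x" )"""):
    print(kind, value)
```

Listing "||" before the single-character symbols keeps the loop separator from being read as
two vertical bars.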

6)  This rule concerns the meaning of the "( )" symbol. For an output that is not an external
output but an intermediate result or interconnection of the DFD, we enclose it and its
continuation (i.e., another textual representation) in a pair of parentheses. One example of
this rule is shown in figure 2.2. In the example, 'a' is an external input, "x" is an external
output, and M1 is an intermediate result. Therefore, according to this rule, we put M1 with
its continuation, i.e., M1 --> "x", into a pair of parentheses as ( M1 --> "x" ). The whole
thing is considered the output of module 1. Or, constructively, for this DFD we have
a --> M1 and M1 --> x, and we connect the two parts through the common item M1. The result is
'a' --> ( M1 --> "x" ).

TR:

          1          2
    'a'  -->  ( M1  -->  "x" )

Figure    2.2    Example of rule (6)
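The constructive reading of rule (6), connecting two parts through their common item, can be
sketched as a short recursive function. This is only an illustration under assumptions: the
dictionary encoding and the function name are hypothetical, the module name is written inside
the arrow because plain text cannot place it on top, every intermediate data item is assumed
to be read by exactly one module, and the DFD is assumed to have no loops or shared data.

```python
# Hypothetical encoding of the DFD of figure 2.2: module -> (inputs, outputs).
DFD = {
    "1": (["a"], ["M1"]),
    "2": (["M1"], ["x"]),
}


def to_tr(dfd, module):
    """Render the TR that starts at `module`, nesting intermediate outputs (rule 6)."""
    produced = {d for _, outs in dfd.values() for d in outs}
    consumer = {d: m for m, (ins, _) in dfd.items() for d in ins}
    ins, outs = dfd[module]

    # Data that no module produces is an external input (single quotes).
    left = ", ".join(d if d in produced else f"'{d}'" for d in ins)

    # Data that some module reads is intermediate: wrap it and its continuation in
    # parentheses. Data that nobody reads is an external output (double quotes).
    right = " | ".join(
        f"( {to_tr(dfd, consumer[d])} )" if d in consumer else f'"{d}"'
        for d in outs
    )
    return f"{left} -{module}-> {right}"


print(to_tr(DFD, "1"))
```

For the DFD of figure 2.2 this yields 'a' -1-> ( M1 -2-> "x" ), the TR of the figure with the
module numbers moved into the arrows.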


7)  Multiple data usage at the same level is handled by using braces. In data flow diagrams,
sometimes more than one module accepts the same data as input or produces the same output.
Several examples are shown in figure 2.3. In example (a), two modules accept the same input
'a' and produce the same output "d". Example (b) shows two modules that produce the same
output but accept different sets of inputs, while (c) illustrates the opposite situation, with
the same input but different outputs. The braces '{ }' enclose the parallel modules that
either share inputs or produce the same outputs at the same level of the DFD. A colon ":"
separates these modules. The shared data is put outside the braces, inputs on the left-hand
side and outputs on the right-hand side. In example (a), 'a' and "d" are the input and output
of both module 1 and module 2, but 'b' is an input of module 1 only and "c" is an output of
module 2 only. The TR for (a) shows these facts; the vertical bar after output "c" means that
the output list has not finished and that its continuation, "d", is outside the right brace.
Examples (b) and (c) show how to develop the TR when either the input or the output is shared,
but not both.

(a)    Mixed situation

                        1       2
    TR :  'a'  {  'b'  -->  :  -->  "c"  |  }  "d"

(b)    Only share output

                   1            2
    TR :  {  'a'  -->  :  'b'  -->  }  "c"

(c)    Only share input

                1                     3             2               4
    TR :  'a'  -->  "b"  |  ( M1  {  -->  "y2"  :  -->  ( M3, "e"  -->  "y1" )  }  )

Figure    2.3      Examples of converting rule (7): data sharing at the same level
8)  Multiple data usage at different levels cannot be expressed by using rule 7. In this case,
we repeat each usage in the TR and put a left or right arrow above the data name to indicate
the direction of the source data. The arrow '<--' means that the source data is at the lower
level or in the left direction, while '-->' indicates the higher level or the right direction
of the source data. Without the upper arrow, we would not be able to find exactly where the
data comes from. Upper arrows always refer to where the data is produced, so exactly one
occurrence of the data carries no upper arrow, and that occurrence is the place where the data
is produced. In the example shown in figure 2.4, M0 is used by both module 2 and module 4, but
at different levels of the DFD. Therefore, M0 not only appears twice in the TR, but the M0
that goes into module 4 also has a left arrow on top of it, indicating that the data comes
from the first appearance of M0 to its left (where M0 is produced by module 1) and not from
the output of module 3.

TR:

          1          5                    2                    3
    'a'  -->  ( M2  -->  "z" )  |  ( M0  -->  "x" )  |  ( M1  -->  "w"  |

              <--   4
        ( M3,  M0  -->  "y" ) )

Figure  2.4    Example of converting rule (8): data sharing at different levels


9)  The symbol ';' in the grammar expresses parallel data flows in DFDs. We first present its
definition and then discuss its usage through two examples in figure 2.5 and figure 2.6. In
the grammar, the symbol ";" appears in the following productions:

    <PARALLEL_DATA>   ::=  <PARALLEL_ITEM>  <MORE_PARALLEL1>

    <MORE_PARALLEL1>  ::=  ";"  <PARALLEL_ITEM>  <MORE_PARALLEL2>

    <MORE_PARALLEL2>  ::=  <MORE_PARALLEL1>  |  ε

where the non-terminal PARALLEL_ITEM can be either a single data name (<MNAME>) or another
nested TR (<NEST_TR>). Suppose we have A ; B. Then the meaning of this expression can be
stated as follows:

    The leftmost element of A (if A is <NEST_TR>) or A itself (if A is <MNAME>), as well as
    the leftmost element of B (if B is <NEST_TR>) or B itself (if B is <MNAME>), are the
    outputs of the nearest arrow on their left.

    The rightmost element of A (if A is <NEST_TR>) or A itself (if A is <MNAME>), as well as
    the rightmost element of B (if B is <NEST_TR>) or B itself (if B is <MNAME>), are the
    inputs of the nearest arrow on their right.

We give two examples to show how it can be used to express parallel data flows:

i)  See figure 2.5, where M1 and M2 are parallel data flows between module 1 and module 2,
i.e., they are both outputs of module 1 and both inputs of module 2. The corresponding TR is
shown below the DFD. According to the definition, since both M1 and M2 are single data names,
they are both outputs of the nearest arrow on their left (module 1) and both inputs of the
nearest arrow on their right (module 2). In other words, their nearest left arrow is the
module producing them, and their nearest right arrow is the module accepting both of them as
inputs. This is a typical parallel data flow between two consecutive modules: different data
flows from the same source to the same destination.


TR:

          1                         2
    'a'  -->  "b"  |  ( M1  ;  M2  -->  "c"  |  ( M3

Figure  2.5    Example of converting rule (9), case (i)


ii)  The second example shows the usage of ";" when A and B are not both single data names.
Figure 2.6 gives a DFD with parallel data flows among different levels. Both M1 and M2 are
produced by module 1, but they are sent to different modules at different levels. M1 goes into
module 2, which produces M3, and M3 is used together with M2 as the inputs of module 3. Here,
we speak of parallel data flow in the sense that M1 and M2 come from the same source and the
consequence (M3) of one of them (M1) is used together with the other one (M2) as the inputs of
one module (module 3). In the case of figure 2.6, A is M2 and B is a TR ( M1 --> "x" | M3 ).
Again, according to the definition, M2 and the leftmost element of B (M1) are the outputs of
the nearest arrow on their left (module 1), and M2 and the rightmost element of B (M3) are the
inputs of the nearest arrow on their right (module 3). This is exactly the situation in figure
2.6, so the TR using ";" does reflect the semantics of the corresponding DFD.

TR:

          1                           2                  3
    'a'  -->  "b"  |  ( M2  ;  ( M1  -->  "x"  |  M3 )  -->  "y" )

Figure  2.6    Example of converting rule (9), case (ii)
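The definition of A ; B amounts to taking the leftmost and rightmost element of every parallel
item. The sketch below illustrates this on the items of figure 2.6. The encoding is an
assumption made for illustration: a single data name is a string, and a nested TR is a
dictionary with hypothetical keys "inputs", "module", and "outputs".

```python
def leftmost(item):
    """A data name is its own leftmost element; a nested TR contributes its first input."""
    return item if isinstance(item, str) else leftmost(item["inputs"][0])


def rightmost(item):
    """A data name is its own rightmost element; a nested TR contributes its last output."""
    return item if isinstance(item, str) else rightmost(item["outputs"][-1])


def parallel_flows(items):
    """Return (outputs of the nearest arrow on the left, inputs of the nearest arrow on the right)."""
    return [leftmost(i) for i in items], [rightmost(i) for i in items]


# Parallel items of figure 2.6:  M2 ; ( M1 --> "x" | M3 )
A = "M2"
B = {"inputs": ["M1"], "module": "2", "outputs": ['"x"', "M3"]}

from_left, to_right = parallel_flows([A, B])
print(from_left)   # data produced by module 1
print(to_right)    # data consumed by module 3
```

With this encoding, M2 and M1 come out as the outputs of module 1, and M2 and M3 as the inputs
of module 3, in agreement with case (ii).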


10)  Rule 10 deals with data that have the same name but different semantics. It is quite
common in DFDs for different modules to produce data with the same name that finally
go to different places. This situation is different from data sharing. Data sharing
means that several occurrences of a data name come from the same source, whereas here
the data from different modules are semantically different even though they have the
same name. For example, different modules update different attributes of the symbol
table in compiler design. In this case, we need to distinguish those data in the TR in
order to keep the correct semantics of the corresponding DFD. The strategy is simple:
we just repeat the name without referencing (in data sharing, every occurrence except
the original one is marked with an up arrow to refer to the source). Figure 2.7 shows
how it works.

'a','b' --> (M1 --> ([M3], ('c','d' --> (M1 --> (M2, M3 --> "x")))))

Figure 2.7    Example of converting rule (10)

The meaning of '[ ]' will be explained later. In this example, both module 1 and module 2
produce data M1, but the two are semantically different. In the corresponding TR, M1
appears twice: the first M1 is from module 1 and the second M1 is from module 2 (not
from the same source).

11)  Loops, especially nested loops, are also hard to represent in a TR. In our
grammar, we use two symbols, "||" and "*", to represent data loops in DFDs. Before we
introduce their meanings, it is helpful to discuss the nature of loops. For the loop
shown in figure 2.8, several facts need to be represented explicitly in the TR:

i)        The M1 from module 1 is semantically different from the M1 from module 2

ii)       The loop itself, which is shown in figure 2.9 (a)

iii)      The final exit of the loop, which is shown in figure 2.9 (b)

'a' --> ( M1 --> M1* || "y" )

Figure 2.8    Example of converting rule (11)

[Figure 2.9: (a) the loop body, in which module 2 takes M1 as input and produces M1; (b) the final exit of the loop]

Figure 2.9    Example of converting rule (11)


The two symbols are used to describe the above semantics of the DFDs explicitly:

a)  Symbol '||' means that the following element is the final exit of a loop

b)  Symbol '*' marks the data which form the loop.

By using those two symbols, we can get the TR for the DFD of the example in figure 2.8.
In the TR, the first M1 is the output of module 1 and the second M1 is the output of
module 2. The second M1 is then linked back (by the meaning of symbol '*') to the
place where the first M1 appears, which completes the loop. Module 2 initially gets the
output of module 1 as input and later accepts the M1 produced by itself as input. The
loop repeats until the exit "y" is reached.

12)  We start to build a TR by choosing one of the paths in the DFD. In the example in
figure 2.7, we start from module 1, and this path can continue only until module 5 is
reached, because the other input of module 5, M2, comes from the other path. Therefore,
when we reach the output M3 of module 3, we have to start processing the other path in
order to get M2, which enters module 5 together with M3. So we need to postpone the
usage of M3 at this point until M2 is produced. Symbol '[ ]' is used for this purpose:
it postpones the usage of the data enclosed by '[' and ']' until the next appearance of
the same data name. In the example shown in figure 2.7, the usage of the first
appearance of M3 (enclosed by '[ ]') is postponed until the next appearance of M3 which,
together with M2, is the input of module 5. If we choose to start from module 2, then
we need to postpone the usage of M2 to wait for M3. Although we get a different TR in
this case, we proved in Appendix G that both TRs recover the same DFD.
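
The symbols introduced by the converting rules can be separated mechanically before any counting is done. The following Python sketch is only an illustration and not part of the method itself: the function name `tokenize_tr` and the ASCII spelling `-->` for the module arrow are assumptions made for this example.

```python
import re

# Token pattern for a TR written with ASCII arrows ("-->").
# Alternatives are tried in order, so "||" is matched before a single "|".
TOKEN_RE = re.compile(
    r"""
    -->                 # module arrow
  | \|\|                # final exit of a loop
  | '[^']*'             # external input, e.g. 'a'
  | "[^"]*"             # external output, e.g. "y"
  | [A-Za-z]\w*         # data name, e.g. M1
  | [|;,*()\[\]{}]      # remaining one-character symbols
    """,
    re.VERBOSE,
)

def tokenize_tr(tr: str) -> list[str]:
    """Split a TR string into tokens, ignoring whitespace (hypothetical helper)."""
    tokens = TOKEN_RE.findall(tr)
    # Guard against characters the pattern silently skipped.
    if "".join(tokens) != "".join(tr.split()):
        raise ValueError("unrecognized characters in TR")
    return tokens

# The loop TR of converting rule (11).
loop_tr = """'a' --> ( M1 --> M1* || "y" )"""
tokens = tokenize_tr(loop_tr)
print(tokens)
```

In the resulting token list, a data name followed by `*` is loop data, and the token after `||` is the final exit of the loop.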

2.3.    Example  and  Conclusions 

So far, we have introduced 13 converting rules. With them, different kinds of DFDs can be
easily converted into their TR forms. Figure 2.10 is another, fairly complicated DFD example,
and we will use the rules developed above to give its TR representation.

Figure 2.10    Example of using converting rules

This  is  a  symmetric  data  flow  diagram  with  loops  and  parallel  data  flows.  The  corresponding 
textual  representation  could  be  : 


'a',M4 --> ([M0], ('b',M5 --> [M3] | (M1,M0 --> M4* || "x")) | (M2,M3 --> M5* || "y")

As we discussed under one-to-one correspondence, a symmetric DFD can yield different TRs, but
they all recover the same DFD. In Appendix G, we develop two different TRs for this
example and use them to recover the same DFD.

To test the effectiveness of TRs, we can show, with the help of the I/O matrix, that a TR
recovers the same matrix as the original one upon which the corresponding DFD was
developed (see Appendix G).

A TR is a linear representation of a two-dimensional DFD, but it preserves all the structural
information of the DFD, including the number of modules, the number of interconnections, their
interfacing patterns, the possible paths, and the relationships among data flows. The
converting rules were developed to keep that information in TRs and to build
a good basis for calculating structural measures from TRs. The next chapter discusses the
measurement of TRs.

Chapter 3
Structural Measures of Textual Representation


Another important task of this research is to provide some useful measures to aid the
evaluation of DFD designs. The measures developed in this chapter can be used either as
indications of the design quality of DFDs or, furthermore, as a quantitative guide in the
process of DFD decomposition.

Much  of  the  software  measures  research  has  concentrated  on  source  programs  and  has  provided 
quantitative  means  of  assessing  the  complexity,  cost,  and  reliability  of  the  resultant  code.  But 
this  research  emphasizes  the  early  phase  of  software  design  because  we  think  that  the  early 
evaluation  of  the  software  designs  can  lead  to  significant  improvements  in  software  quality  and 
a  significant  decrease  in  development  cost. 

The  data  flow  information  of  a  program  has  been  used  for  measuring  program  characteristics  in 
two  different  approaches.  One  is  to  use  the  data  flows  among  modules  to  define  the  nature  of 
the  interrelationships  of  these  modules  ([HENRY811],  [HENRY812]).  Another  one  is  to  use  the 
data  flows  within  a  module  to  describe  the  nature  of  the  program  ([OVIED80],  [rYENG82]).  In 
this  research,  measures  will  be  developed  according  to  the  data  flows  among  modules  (  the  first 
approach  )  so  that  the  evaluation  can  be  done  based  on  the  structure  of  the  whole  diagram 
instead  of  just  the  individual  modules. 

In addition, the research focuses on the structural measures of DFDs and does not consider
any criterion related to the semantics of the project. The reason is that DFDs
themselves depict only the flow of data rather than the flow of control. They treat data as
information but do not distinguish what kind of data is being used [TROY81]. Therefore, the
measurements developed in this chapter are concerned not with the type of the data but only
with the data flow pattern, i.e., we do not care how the data flow is controlled in the DFDs
but only where the data flows come from and where they go.

Because of the strategy we adopted in the research, we want to measure what we can see in
the structure of the DFDs, not their semantics. The textual representation
of a DFD developed in chapter two is a good basis for calculating structural measures
because it is exactly a structural projection of the DFD and keeps all of its structural
characteristics.

Basically, we try to develop measures that are useful in evaluating DFDs according to
criteria such as structural complexity and interconnections. The following
sections introduce the basic measures counted from TRs and some advanced measures
built to reflect specific characteristics of data flow diagram design. These measures are also
developed under the second hypothesis stated in chapter one.

3.1.    Basic  Measures  Counted  From  TRs 

Before we introduce the basic measures obtained from TRs, we give examples of
several terms. This helps to avoid confusion and makes the background explicit.

The first one is the 'width of the data usage'. The definition given in chapter one is "the
number of the occurrences of an independent variable in the DFD". Attention must be
paid to the difference between this concept and Henry and Kafura's 'fan-out' concept. The
example in figure 3.1 shows the difference.




Figure  3.1    Example  of  'width  of  the  data  usage' 

In figure 3.1, 'a' is an independent variable. According to the definition, the width of a's usage
is 5, since it appears five times in the graph. Henry and Kafura's fan-out would be only three in
this case.

The second term is "path". The example in figure 3.2 explains
the meaning of the definition of a path more clearly. For this DFD, the unique sequence of
variable/module names that goes from an external input to an external output is "aXbYc". Here, a, b,
and c are variables, and X and Y are module names. The sequence defines the order in which a
specific data flow passes through the components of the DFD.

Corresponding path is : a X b Y c
Figure 3.2   Example of a path
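
The path of figure 3.2 can also be enumerated mechanically. The sketch below is a minimal illustration under our own assumptions: the dictionary encoding of the DFD (each module mapped to its input and output variables) and the function name `paths_from` are hypothetical and do not appear in the method itself.

```python
# Hypothetical encoding of the DFD in figure 3.2:
# each module maps to (input variables, output variables).
modules = {
    "X": (["a"], ["b"]),
    "Y": (["b"], ["c"]),
}
external_inputs = ["a"]
external_outputs = {"c"}

def paths_from(variable, seen=()):
    """Yield every variable/module sequence from `variable` to an external output."""
    if variable in external_outputs:
        yield [variable]
        return
    for name, (inputs, outputs) in modules.items():
        # Skip modules already on this path so that loops terminate.
        if variable in inputs and name not in seen:
            for out in outputs:
                for rest in paths_from(out, seen + (name,)):
                    yield [variable, name] + rest

all_paths = [p for v in external_inputs for p in paths_from(v)]
print(["".join(p) for p in all_paths])   # ['aXbYc']
```

The number of sequences found this way corresponds to counting the paths from every external input to every external output.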

The following is the list of basic measures we can obtain directly from a TR:

1)  UM -

Number of unique arrows in the representation, i.e., the number of unique boxes (modules) in
the graph

2)  SM -

Number of modules that share a common input data with other modules

3)  LM -

Number of modules that are in a loop

4)  UI -

Number of unique external inputs

5)  CIi -

Number of occurrences of the ith unique external input. Attention has to be paid to
both symbols '{}' and '[ ]' when we count this measure from TRs. The modules in '{}'
share the input outside the braces, so the number of occurrences of the outside input is
the number of modules within the braces. The same holds for counting the occurrences
of outside outputs. Any data enclosed by '[ ]' should not be counted there, because symbol
'[ ]' postpones the data usage; the occurrence is counted where the data is really used.

6)  UO -

Number of unique external outputs

7)  COi -

Number of occurrences of the ith unique external output (see (5))

8)  CTMi -

Number of tokens related to the ith module, including input tokens and output tokens

9)  UIC -

Number of independent interconnections (the unique variables that are not quoted by either
' ' or " ")

10)  CICi -

Number of occurrences of the ith interconnection variable (see (5))

11)  CIC -

Total number of occurrences of all interconnections, Σ_i (CICi)

12)  UIV -

Number of independent variables in the graph, including inputs, outputs, and
interconnections. It is calculated by UI + UO + UIC

13)  CIV -

Total number of occurrences of independent variables, which is
Σ_i (CIi) + Σ_i (COi) + CIC

14)  P -

Number of possible data flow paths in the graph ( from every external input to every
external output )

15)  CW -

Conceptual width of the graph ( the largest degree of data usage ). It is Max(CICi, CIi,
COi)

16)  NPMi -

Number of paths that go through the ith module (box)

17)  DPi -

The length of the ith path ( the number of components of the DFD the ith path has to
pass through )

18)  DP -

The conceptual length of the graph ( the longest path in the graph ). It is Max(DPi).

19)  CL -

Number of loops in the graph

20)  LLi -

The depth of the ith loop ( the number of modules that are involved in the ith loop; for
example, for a self-loop, LLi = 1 )

21)  CSIZE -

Conceptual size of the graph, which is the product of the conceptual width and the
conceptual length of the graph, CW * DP

All of the above measures can be easily obtained from the textual representations of DFDs. These
measures describe different characteristics of DFDs and form a good foundation for designing
advanced measures that indicate software design quality. The following section
discusses the criteria we use in evaluating DFDs, the factors that influence the
criteria, the way they affect the quality of DFDs, and how we build measures to reflect all
of the above.
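
The derived items in the list (CIC, UIV, CIV, CW, DP, and CSIZE) follow directly from the per-variable and per-path counts. The sketch below shows the arithmetic only; the function name `derived_measures` and all the numbers are made up for illustration and are not taken from any DFD in this research.

```python
def derived_measures(ci, co, cic, dp):
    """Combine basic counts into the derived measures (hypothetical helper).

    ci  -- CIi:  occurrences of each unique external input
    co  -- COi:  occurrences of each unique external output
    cic -- CICi: occurrences of each interconnection variable
    dp  -- DPi:  length of each path
    """
    ui, uo, uic = len(ci), len(co), len(cic)
    cic_total = sum(cic)                    # CIC = sum of CICi
    uiv = ui + uo + uic                     # UIV = UI + UO + UIC
    civ = sum(ci) + sum(co) + cic_total     # CIV = sum CIi + sum COi + CIC
    cw = max(cic + ci + co)                 # CW  = Max(CICi, CIi, COi)
    dp_max = max(dp)                        # DP  = Max(DPi)
    return {"CIC": cic_total, "UIV": uiv, "CIV": civ,
            "CW": cw, "DP": dp_max, "CSIZE": cw * dp_max}

# Made-up counts: two inputs, one output, two interconnections, two paths.
m = derived_measures(ci=[2, 1], co=[1], cic=[1, 3], dp=[5, 7])
print(m)
```

With these made-up counts the conceptual width is 3 and the conceptual length is 7, so CSIZE is 21.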

3.2.    Criteria in Evaluating DFDs

Several criteria will be used to evaluate data flow diagrams: complexity, interconnections,
modularity, cohesion, and ease of implementation. These criteria are related to and
influence each other, in the sense that an increase in one often causes a change, in the same
or the opposite direction, in the others. Therefore, when we consider the proper measures for the
criteria, we also take this fact into account in order to model the real situation correctly. We
first introduce each criterion in detail and then try to use appropriate measures to model
it.

Interconnections    : 

For ease of understanding, we discuss interconnections first. This criterion can be
considered an indication of the strength of coupling among modules. The strength of
coupling is affected by the following factors:

a)  Average token burden. The number of tokens through each module in the
diagram is related to the coupling strength. The larger the average number
of tokens processed by each module in the DFD, the stronger the
coupling. This can be formulated as the total number of tokens processed in
the DFD divided by the number of modules in the DFD, i.e.,

ATB = Σ_i (CTMi) / UM

b)  Average connection burden. The coupling increases with the number of
interface connections in the DFDs [TROY82]. The larger the average number of
interconnections each module creates in the DFD, the stronger the coupling.
This can be calculated as the total number of interconnections
in the DFD divided by the total number of modules in the diagram:

ACB = CIC / UM

c)  The interfacing pattern of these interconnections. Again, coupling strength
increases with the complexity of the interface among modules in a
DFD [TROY82]. It can be considered further from several perspectives:

Average path burden. The way the interconnections are related in a
DFD determines the number of possible paths in the DFD. The
larger the average number of paths that go through each module in
the DFD, the stronger the coupling. This can be measured by:

APB = P / UM

that is, the total number of possible paths in a DFD divided by the
total number of modules in the DFD.

Data sharing degree. Coupling also increases when more than one
module interfaces with the same data, i.e., shares a common
environment [TROY82]. We define this degree based on two considerations.
One is the percentage of the total number of modules in a DFD that
share a common data environment (MDSR). The other is the
percentage of the independent variables that are shared in the DFD
(DSR). These two factors are defined by the following equations,
respectively:

MDSR = (SM + UM) / UM
DSR = CIV / UIV

Then the data sharing degree is simply the sum of these two:

DS = MDSR + DSR

Loop density. This density is also viewed from two aspects, the average
loop length:

ALEN = (Σ_i (LLi) / CL) + 1

and the loop frequency:

LF = (LM + UM) / UM

Finally, the loop density is defined as follows:

LD = LF * ALEN

All of the above factors proportionally influence the coupling strength of the
DFD. Therefore, we use their sum to reflect the nature of the
interconnection of DFDs:

INTER = ATB + ACB + APB + DS + LD

Complexity    :

Measuring what we can see in the DFDs is one of our goals. The complexity of
DFDs is therefore determined by their size and interconnection characteristics, since these are the
only things we can see and measure in DFDs. We have discussed the measure of
interconnection above. In the last section, we introduced the measure of the conceptual size of a
DFD, which is the product of its conceptual width and length. We now
define the complexity measure of a DFD to be the product of its conceptual size
and its interconnection measure:

COMP = CSIZE * INTER

Modularity    : 

Modularity is the extent to which software is composed of discrete components such that
a change to one component has minimal impact on other components [STAND87].
Modularity is therefore a measure of the relative independence among modules.
Usually, fewer connections among modules indicate better module independence and thus
better modularity. It can be measured by the average number of variable occurrences
per module, formulated as follows:

MOD = CIV / UM

Cohesion    : 

In this study, we are concerned with only two kinds of cohesion: functional
cohesion and logical cohesion. We again consider them separately.

a)  A functionally cohesive module should ( ideally ) do just one task, i.e., it
should have singularity of task. This is similar to the concept of measuring
independence among modules: cohesive modules are more independent, with only
the necessary connections to other modules. Therefore, we think that this
criterion should be somehow related to the evaluation of modularity. Instead of
defining a measure for this criterion, we test this relationship by the statistical
analysis described in chapter four. If we can confirm the relationship,
the measure can be designed based on that conclusion.

b)  In this research, we define logical cohesion as the logical strength of the
whole diagram, i.e., how strongly the modules are related to each other.
Diagrams with low logical cohesion do unrelated tasks.

Again, we feel that this criterion might be affected by interconnection.
Unrelated modules in a DFD usually mean that the DFD carries out unrelated
tasks, and high interconnection usually indicates that modules are
tightly related. We intend to explore the relationship between logical
cohesion and the other criteria such as interconnections. How to design this
measure will depend on the data analysis results explained in chapter four.

Ease  of  Implementation    : 

We believe that the criterion 'ease of implementation' is a function of
complexity: the more complex a DFD is, the harder the implementation will be. The
relationship is assessed through the experiment reported in chapter four.

3.3.    Examples  of  Calculated  Measures 

To show how to calculate the measures designed in the previous section, we choose six
DFDs (the ones used as the objects in the experiment introduced in chapter four),
calculate their basic counts and measures, and show them in Appendix A. The rows are
the lists of external inputs/outputs, interconnections, modules, paths, and loops. The columns
are the corresponding basic counts, such as the unique modules UM, the sharing modules SM, etc.
Those rows and columns form a table for each DFD. Some of the tables are too big to fit on
one page, so they are shown on several sheets. From the numbers marked for rows and
columns, it is easy to recognize each part of a table.

Based on the basic counts of the six DFDs, various measures are calculated for the different
criteria. They are shown in Table 3.1. The rows correspond to the six DFDs and the columns are the
measures. For example, for the first DFD, ATB is 2.71, ACB is 0.43, APB is 1.86, COMP is
194.84, and INTER is 9.74.
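
The values quoted for the first DFD can be cross-checked against the definitions INTER = ATB + ACB + APB + DS + LD and COMP = CSIZE * INTER. The short sketch below only rearranges those two formulas. The variable names are ours, and the results are implied values computed from the rounded figures, not entries read from Table 3.1.

```python
# Values quoted in the text for the first DFD.
atb, acb, apb = 2.71, 0.43, 1.86
inter, comp = 9.74, 194.84

# INTER = ATB + ACB + APB + DS + LD, so the remaining two terms sum to:
ds_plus_ld = inter - (atb + acb + apb)

# COMP = CSIZE * INTER, so the conceptual size implied by the quoted values is:
implied_csize = comp / inter

print(round(ds_plus_ld, 2))     # 4.74
print(round(implied_csize, 2))  # 20.0
```

Up to rounding, the quoted figures are therefore consistent with a conceptual size of about 20 and with DS + LD of about 4.74 for this DFD.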

Chapter 4
Survey in this research


In chapter three, we developed structural measures for DFDs in order to evaluate their quality.
The question is how effective these measures are when we use them to evaluate real DFDs.
To test their validity, we conducted a survey to examine their significance empirically.

There are two goals in this survey:

1)  Empirically validate the structural measures developed in chapter three by using
statistical analysis techniques

2)  Build fuzzy membership functions for the linguistic concepts (terms) used in evaluating
DFDs, based on the survey data and the validation result from 1)

The hypothesis of the first step is that a valid measure will preserve the "correct order" of the
DFDs (see assumption 2 in chapter one). The hypothesis of the second step is that the
meanings of all terms in a natural language are to a lesser or greater degree vague, so that
the boundary of the application of a term is never a point but a region in which the term
gradually moves from being applicable to being nonapplicable [HERSH76].

This chapter discusses the first goal in detail. It introduces the survey
environment, data collection, data analysis, and the results of the data analysis. Chapter five
discusses the issues related to fuzzy set theory and its application in this research, and presents the
results of the second step.

4.1.  Survey Environment

This survey was done at the Department of Computing and Information
Sciences, Kansas State University. It was performed by choosing a set of objects, presenting them to the chosen
subjects, and asking the subjects to evaluate the objects according to some predefined criteria.
The subjects and the objects were selected as follows:

Subjects    :

The source of the survey data was two software engineering classes of the Computing and
Information Sciences Department at KSU: one was the graduate-level class CIS 740 and the
other was the undergraduate-level class CIS 541. We chose these classes
because their students were familiar with DFDs and the terminology
related to this research. The total number of subjects was 53.

Objects    :

The objects were 6 sets of parent boxes and their expanded data flow diagrams. The
DFDs chosen for the survey had different degrees of complexity within each measurement
category, such as interconnection, complexity, etc. Also, their semantics had to be hidden
from the subjects because a) we are concerned only with the structural evaluation, and b) the
pilot study done at an early stage of this research showed that people who know the
semantics of the DFDs cannot evaluate them structurally in a proper way. Six objects were
selected in this way. They are presented in appendix B.

4.2.  Data  Collection 

Data collection involves both a data-collection tool and a collection procedure. They are
discussed in the following:

Data-collection tool    :

The data-collection tool used in the survey is the data-collection form. The form
addresses 9 criteria for evaluating DFDs and has 10 questions. The questions are
designed to satisfy the following two conditions:

i)  Explicitly define the meaning of each criterion before the questions are asked,
to avoid misunderstandings

ii)  In the definitions of the criteria, avoid giving hints about how the related
measurements might be constructed, so that the answers are not
led by the questioner

In the data-collection form, there are 6 possible answers for each question. Each answer
is an evaluative adjective phrase which indicates a certain degree of the
related criterion. The six answers are ordered along a favorable-neutral-unfavorable continuum.
For example, for the complexity criterion, the six answers run from 'very complex', 'fairly
complex', and 'more complex than simple' down to 'very simple'. In the survey, the complexity criterion
is asked about twice, in the first and the last question. The purpose is to give subjects another chance at
the end to adjust their answer based on their overall impression of the objects (see appendix C).

Data  Collection  Procedure    : 

We presented the survey during class time. After presenting the purpose of the survey,
we asked students to read the definitions of the criteria and to ask any questions about
them. Again, when we answered questions, we tried to avoid giving hints about how to
choose an answer or how the measures might be built. The answer session began
when there were no questions left about the definitions. We showed each chosen object
(DFD) on a slide and asked students to answer all 10 questions about the current slide.
The process continued until all objects had been examined. Every student signed his/her name
on the data-collection form in case we needed to ask specific questions about the answers.

After the survey, the results were entered into an Excel spreadsheet. One of
the answer forms was dropped because it had too many missing values. Therefore, the final valid
number of subjects was 52. The next section examines the data processing and presents
the results of the statistical analysis.

4.3.    Data  Analysis  and  Results 

In Appendix D, there are 5 spreadsheets corresponding to the 5 questions that are of interest
in this research. The rows represent subjects and the columns represent objects. The values in
a spreadsheet are codes for the answers. For example, in the first spreadsheet, value
1 stands for the answer 'very complex' and 2 stands for the answer 'fairly complex'. These values
keep the order of the original phrases, which makes statistical analysis applicable in the later
data analysis.

Several  analysis  techniques  were  used  in  data  processing  for  different  purposes.  They  are 
separately  discussed  in  the  following  sections. 

Simple  Statistics    : 

In  order  to  check  the  consistency  of  the  answers  and  the  bias  of  each  subject,  several  sim- 
ple statistics  were  calculated  such  as  average,  standard  deviation,  and  frequencies.  For 
each  row  in  the  spread  sheet,  the  average  was  calculated  to  show  the  basic  attitude  of 
the  corresponding  subject.  Some  people  are  always  more  optimistic  than  others  and  some 
people  are  always  more  pessimistic  than  others.  This  average  will  provide  us  a  good 
review  of  their  original  bias  and  it  is  the  base  for  normalizing  the  answers.  In  this 
research,  we  emphasize  evaluating  the  relative  complexity  levels  of  DFDs  hence  we 
finally  need  to  adjust  all  people's  basic  attitudes  to  the  same  ground.  This  is  the  normali- 
zation problem  and  we  will  talk  about  it  in  the  next  section. 

Another  average  was  calculated  based  on  each  column.  It  represents  the  average  evalua- 
tion of  a  criterion  of  the  corresponding  object.  If  different  columns  have  different  average 
values,  it  means  that  different  objects  get  different  evaluations  on  the  criterion  scale. 
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This  average  is  shown  on  row  71  on  the  spread  sheet. 

Also for each column (each object), frequencies were calculated to show the distribution
of the answers. They are shown from row 62 through row 67. This is useful for checking
the consistency among the answers. For example, on the first spread sheet (addressing the
criterion 'complexity'), the number 22 in cell C-64 means that 22 people (out of 52)
responded '3' (more complex than simple) for the first object. Also, if different columns
have different peaks in their distributions, different objects received different
evaluations on this criterion.
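
The simple statistics above can be sketched in a few lines of Python. This is only an
illustration, not the spread sheet calculation used in the study: the `answers` matrix is
made-up data (one row per subject, one column per object, values 1 to 6), and the
function names are hypothetical.

```python
from statistics import mean, stdev

def row_averages(answers):
    """Average of each subject's answers (one value per row)."""
    return [mean(row) for row in answers]

def column_stats(answers):
    """(average, standard deviation) of each object's answers (one pair per column)."""
    return [(mean(col), stdev(col)) for col in zip(*answers)]

def column_frequencies(answers, levels=range(1, 7)):
    """For each column, count how many subjects gave each answer 1..6."""
    return [[list(col).count(v) for v in levels] for col in zip(*answers)]

# Hypothetical answers: 4 subjects (rows) x 3 objects (columns).
answers = [
    [3, 2, 5],
    [4, 2, 6],
    [3, 1, 5],
    [2, 2, 4],
]
print(row_averages(answers))        # one bias indicator per subject
print(column_stats(answers))        # one (average, std. dev.) pair per object
print(column_frequencies(answers))  # one six-entry distribution per object
```

A column whose frequency list peaks at a different answer than its neighbours corresponds
to an object that the subjects evaluated differently.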

Normalization    : 

To adjust all the answers to the same base, we normalized the raw data. We subtracted
from every answer the average value of the corresponding row, i.e., we replaced the
original values by their distances from the subject's mean. In this way, we reduced the
bias while preserving the order of each subject's evaluations, which is what matters most.

The normalized data is shown in Appendix E. Frequencies were also calculated for the
normalized data, based on six subranges defined over the range of the normalized
answers. In Appendix E, for each question spread sheet, rows 73 and 74 give the maximum
and minimum values of the corresponding columns. The Max value on row 59 is the maximum
of the whole spread sheet (the maximum value of row 73), and the Min value on row 59 is
the minimum of the whole spread sheet (the minimum value of row 74). The whole range of
the answers is therefore [Min, Max], and its length is given by Total on row 59. Dividing
this length by six gives the length of each subrange, which is the value on row 60. In
this way, we get six subranges representing the six evaluative adjective phrases. Based
on these subranges, frequencies were calculated again by counting the number of
normalized answers that fall in each subrange.
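
The normalization and the subrange counting can be sketched as follows. This is an
illustration under stated assumptions, not the spread sheet formulas of Appendix E: the
function names are hypothetical, and assigning the overall maximum to the last subrange
is a boundary convention chosen here, since the text does not say how boundary values
were counted.

```python
def normalize_rows(answers):
    """Subtract each subject's own average from that subject's answers."""
    out = []
    for row in answers:
        avg = sum(row) / len(row)
        out.append([v - avg for v in row])
    return out

def subrange_frequencies(normalized, n_bins=6):
    """Split [Min, Max] of the whole sheet into n_bins equal subranges and
    count, per column, how many normalized answers fall in each subrange."""
    flat = [v for row in normalized for v in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / n_bins          # the value shown on row 60
    counts = []
    for col in zip(*normalized):
        freq = [0] * n_bins
        for v in col:
            # Assumed convention: the overall maximum goes in the last subrange.
            idx = min(int((v - lo) / width), n_bins - 1) if width > 0 else 0
            freq[idx] += 1
        counts.append(freq)
    return counts

# Hypothetical raw answers: 2 subjects x 2 objects.
raw = [[1, 6], [3, 4]]
norm = normalize_rows(raw)
print(norm)
print(subrange_frequencies(norm))
```

Each normalized row sums to zero, so a subject who answers uniformly high or low no
longer shifts the column averages.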

The normalized data shows a slightly different distribution from the raw data, but again
the rank of the evaluations is preserved.

Statistical  Analysis    : 

Statistical analysis was applied to the averages across the subjects, using the
statistics package SAS. Averages were used rather than the original data because they
are smoother. Although using averages leads to a small sample size, we adopted them
because each average is calculated from a fairly large sample (52 subjects) and is
therefore stable enough for the statistical analysis.

The data shown on pages F-1 and F-2 is the data file used by SAS. The columns in the
file correspond to the questions addressed in this research, together with some measures
calculated from the chosen objects according to the definitions in chapter three. The
first two rows indicate the relationship between the SAS variable names and their
meanings. For example, the second column is 'complexity', whose variable name in the SAS
file is V1. The rows in the SAS file correspond to the objects used in the survey.

We need to validate 1) the effectiveness of the measures we developed in chapter three
and 2) the relationship among the different criteria. Several hypotheses related to these
purposes are stated as follows :

a)  complexity  measure  is  valid  according  to  the  definition  in  assumption  (2)  in 
chapter  one 

b)  interconnection  measure  is  valid  under  assumption  (2) 

c)  modularity  measure  is  valid  under  assumption  (2) 

d)  ease   of  implementation   is   proportionally   related   to   complexity   and/or   to 
modularity 

e)  complexity  is  somehow  related  to  modularity  and/or  cohesion 

Two basic techniques were used: correlation analysis and regression analysis. They are
discussed below.

Correlations    : 

A complete correlation matrix across all the variables in the SAS file was built to show
the correlations between pairs of variables. It is on pages F-3 and F-4. From the matrix,
we can see that V1 (complexity) and V2 (interconnection) have a strong correlation of
-0.94, V1 and V13 (complexity measure) have a correlation of -0.91, V2 and V14
(interconnection measure) have a correlation of 0.93, and V3 (modularity) and V16
(modularity measure) have a correlation of only 0.25.

One interesting point is that some correlations confirmed guesses we had formed when
reviewing the match between the survey results and the calculated measures. For example,
for complexity, we calculated the complexity measure of every object and then arranged
the objects in the order of that measure, from the lowest (least complex) to the highest
(most complex). The order is : obj-3, obj-2, obj-1, obj-6, obj-5, obj-4. This can be
seen in Table 4.1. The first row of Table 4.1 shows this order and the second row lists
the corresponding measures. To check the match between the measures and the survey data,
the third row of Table 4.1 gives the column averages of the normalized data for all the
objects, arranged in the same order. The column averages should run in the reverse order
(from the largest average to the smallest) because in the original answers 1 stands for
'very complex' while 6 stands for 'very simple'.

Checking the consistency of the two rows shows that obj-5 is the only object that does
not match.

Objects    obj-3    obj-2    obj-1    obj-6    obj-5    obj-4
Measure    42       148.8    194.8    443.9    679.2    1947.8
Average    1.17     0.69     0.19     -0.46    0.19     -1.79

Table 4.1  Complexity evaluations & complexity measures
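
The consistency check on Table 4.1 can be expressed as a short sketch. The values are
copied from the table; the function name `order_violations` is hypothetical. With the
objects sorted by increasing measure, the averages should never increase from one object
to the next, so any increase marks a mismatch.

```python
# Values copied from Table 4.1, already sorted by increasing complexity measure.
objects  = ["obj-3", "obj-2", "obj-1", "obj-6", "obj-5", "obj-4"]
measures = [42, 148.8, 194.8, 443.9, 679.2, 1947.8]
averages = [1.17, 0.69, 0.19, -0.46, 0.19, -1.79]

def order_violations(names, averages):
    """Return the names whose average is higher than the preceding object's,
    i.e. objects judged simpler than an object with a smaller measure."""
    return [names[i] for i in range(1, len(averages))
            if averages[i] > averages[i - 1]]

print(order_violations(objects, averages))  # ['obj-5']
```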

We rechecked object 5, compared it with the other objects, and reviewed how the
complexity measure is constructed: it is calculated from the interconnection measure and
the size of the DFD, and one component of the interconnection measure is the data sharing
degree DS. We concluded that the subjects underestimated or even ignored data sharing
when they judged the complexity of the DFD. Object 5 has very strong data sharing: four
modules share both the same input and the same output, and the shared output is linked
back to complete a data loop. This is a fairly complicated pattern of interconnections.
From a maintenance point of view, this structure also clearly increases the complexity of
the DFD. Visually, however, object 5 does not show messy connections among its modules,
which may have misled the subjects into underestimating the complexity of the DFD.

From the correlation matrix on page F-4, V2 (interconnection) and V22 (data sharing
measure DS) have a correlation of -0.06. This gave us partial confidence in our
conclusion. To support the conclusion further, we later carried out a regression analysis,
which showed no evidence in the survey data of a relationship between the interconnection
evaluation and the data sharing degree.

Regression  analysis    : 

Regression analysis was carried out among different groups of variables. Those groups
and their regression results from SAS are listed starting on page F-4 of Appendix F. For
ease of discussion, the most useful information is extracted from there and put into
Table 4.2.

Regression Results Summary

No.   Dep.   Indep.     R-squ.   P
 1    V11    V12        0.9889   0.000
 2    V11    V9         0.7683   0.023
 3    V11    V13        0.7205   0.033
 4    V9     V16        0.0626   0.634
 5    V8     V14        0.8564   0.009
 6    V8     V17        0.8070   0.016
 7    V8     V18        0.8482   0.010
 8    V8     V19        0.8156   0.015
 9    V8     V22        0.0037   0.874
10    V8     V25        0.7178   0.034
11    V8     V17-V25    1.0000   *****
12    V7     V8         0.8883   0.006
13    V7     V15        0.8716   0.008
14    V7     V13        0.8427   0.011
15    V7     V8, V15    0.9290   0.023
16    V7     V10        0.4454   0.147

Table 4.2  Summary of Regression Results

There are 16 regressions listed, and their results lead to our conclusions. Two
statistics, R-square and P, indicate the significance of a regression result. R-square is
the proportion of the variation in the dependent variable that is explained by the
independent variables through the regression formula; the closer it is to 1.0, the better
the formula fits the data. For example, an R-square of 0.9 means that 90% of the variation
in the dependent variable is accounted for by the regression. P measures the evidence
against the null hypothesis that the coefficients of the independent variables in the
regression formula are zero: it is the probability of obtaining a fit at least this strong
if the null hypothesis were true. Therefore, the smaller P is, the stronger the evidence
against the null hypothesis. Usually, if P is less than 0.05, the null hypothesis is
rejected, i.e., the coefficients are considered trustworthy.

Therefore, the closer the value of R-square is to 1.0, the stronger the linear
relationship between the dependent variable and the independent variables. Also, the
smaller P is, the more confidence we can have in the relationship that the regression
formula establishes between the regressed variables. The conclusions drawn from the
results in Table 4.2 are given in the next section.
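
For readers who want to see how R-square arises, the following sketch fits a
single-predictor regression by least squares. It is not the SAS procedure used in this
research: the function name and the sample data are hypothetical, and P is not computed
because it requires the t (or F) distribution that SAS evaluates internally.

```python
def simple_linear_regression(x, y):
    """Least-squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    # R-square: share of the variation in y explained by the fitted line.
    return a, b, 1.0 - ss_res / ss_tot

# Hypothetical data: six objects, one measure (x) and one averaged evaluation (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 0.8, 0.3, -0.2, -0.6, -1.4]
a, b, r2 = simple_linear_regression(x, y)
print(f"intercept={a:.3f} slope={b:.3f} R-square={r2:.4f}")
```

With only six objects, each regression has very few degrees of freedom, so a high
R-square should be read together with its P value.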

4.4.    Statistical  Conclusions 

The conclusions are stated one by one in the order of the regressions listed in Table 4.2.
All the conclusions are presented qualitatively rather than numerically, because the
regressions were based on both numeric and non-numeric values (the normalized answers are
still symbols, not true numerical values). We do not use the regression formulas as
deterministic relationships among the criteria; we use them only to identify the patterns
of how the criteria affect each other. This is valid because the symbolic system we use
(the symbols representing the evaluative phrases) keeps the correct order of the criteria
even though the symbols have no precise numeric meaning.

1)  Ease  of  implementing  a  DFD  is  linearly  affected  by  the  complexity  level  of  the  DFD. 
This  conclusion  is  from  the  first  regression  result  with  R-square=0.9889  and  P=0.000. 

2)  Ease of implementing a DFD is linearly affected by the modularity of the DFD. The
conclusion is from the second regression result with R-square=0.7683 and P=0.023.

3)  Ease of implementing a DFD is a linear function of the complexity measure developed in
chapter three. This is concluded from the third regression in Table 4.2 with
R-square=0.7205 and P=0.033.

4)  The  modularity  measure  developed  in  chapter  three  is  not  a  valid  measure  because  it 
doesn't  have  any  relationship  with  the  modularity  evaluations  from  the  survey.  The 
regression  between  this  measure  and  the  survey  result  shows  this  with  R-square  of  0.0626 
and  P  of  0.634. 

5)  The interconnection measure developed in chapter three shows good predictive ability.
This can be concluded from the regression between this measure and the survey result
for the corresponding criterion. The R-square value is 0.8564 and the P value is 0.009.

6)  The interconnection measure is constructed from five components: average token burden
ATB, average connection burden ACB, average path burden APB, data sharing degree
DS, and loop density LD. Regressions 6 through 10 show that the interconnection
criterion is linearly related to every one of the components except DS. Also, the
result of regression 11 shows that the regression model composed of those five
components predicts the interconnection criterion perfectly (R-square is 1.0). This is a
surprisingly good result, which indicates that the design of the interconnection measure
is reasonable. Looking at the regression formula for regression 11 (see Appendix F), we
notice that the coefficients of the components differ from the ones we developed in
chapter three. But the formula from chapter three is also significant enough to be used as
a valid measure. This can be seen from the result of regression 5 (the regression between
the interconnection evaluation and the interconnection measure), with an R-square value of
0.8737 and a P value of 0.008.

The fact that the interconnection evaluation is not significantly related to the DS
measure is actually the conclusion we expected. The regression between V8
(interconnection) and V22 (DS measure) shows no relationship. This can be seen from the
low R-square value 0.039 and the very high P value 0.873 (see the result of the regression
between V8 and V22). Therefore, the conclusion that the subjects underestimated or ignored
data sharing when they evaluated interconnection is supported by both the correlation and
the regression analysis. This conclusion is the basis for the decision to drop obj-5
(DFD5) when we build the fuzzy membership functions of the criteria in chapter five.

7)  Complexity  criterion  is  linearly  affected  by  interconnection  criterion.  This  conclusion  is 
from  regression  12.  Regression  12  (V7  -  complexity  VS.  V8  -  interconnection)  shows  high 
R-square  value  0.8882  and  low  P  value  0.006. 

8)  Complexity criterion is also linearly affected by the conceptual size of DFDs. The high
R-square value 0.8737 and the low P value 0.008 support this.

9)  Complexity  criterion  is  affected  by  the  linear  combination  of  interconnection  and  the  size 
of  DFDs.  The  high  R-square  value  0.9298  and  the  low  P  value  0.022  verified  that. 

10)  The complexity measure developed in chapter three is a valid measure. This follows
from the result of regression 14, where the R-square value is 0.8448 and the P value is
0.011. The interesting thing here is that linearly combining interconnection and the size
of the DFD seems to predict the complexity degree better than multiplying them (the
formula developed in chapter three). However, both are significant enough to predict
complexity in this study.

11)  There is no strong evidence of a relationship between the complexity and logical
cohesion criteria. The regression between V7 (complexity) and V10 (logical cohesion)
shows this by the low R-square value 0.4454 and the high P value 0.147.

The above conclusions make formalizing DFD evaluation more feasible. We can linearize a
DFD by constructing its TR form, calculate the valid measures directly from the TR (such as
ATB, APB, INTER, and COM), and report the degree of complexity, interconnection, or ease
of implementation of the DFD. The reported measures can be used as a guide in the
decomposition of DFDs.

Chapter 5
Fuzzy Classifications of Evaluating Criteria


As we mentioned before, many software measures do not give users an intuitive,
straightforward feeling for the quality of the measured object. For example, saying that
McCabe's measure is 25 usually does not say much about the relative complexity level of
the measured software code, especially to managers who are not familiar with software
measurement mechanisms. This is why, after establishing the validity of some of the
designed measures in chapter four, this research tries to give users a more intuitive
evaluation such as 'fairly complex' or 'very easy', by applying fuzzy set theory to
classify the different levels of the criteria.

This step of the research converts the DFDs' measures into linguistic classification
categories. This is a mapping from measurement ranges to fuzzy concepts in a natural
language, for example, mapping a range of McCabe's measure to the classification concept
'very complex' with a certain grade of membership. We believe that the linguistic
classification terms ('very complex' in this example) will usually give users a better
feeling for the complexity level of the measured object.

The tool we used in this step of our research is fuzzy set theory. We chose it because 1)
natural language concepts (terms) are inherently fuzzy [HERSH76], and 2) fuzzy set theory
has been developed to offer a formal treatment of the vagueness of natural language
concepts.

5.1.  Vagueness in a Natural Language and its Proper Representation

The vagueness in a natural language enters in the process of mapping a linguistic term
onto a universe. This can be seen from two aspects: one is the vagueness of the boundary
and the other is ambiguity. For example, when we say that a person is tall, we do not
know the precise height (universe) boundaries for the concept 'tall' (in fact, there is
no precise boundary). Because of this, some concepts, such as 'tall' and 'fairly tall',
may overlap over the universe. For example, a height of 1.78 meters can be described
either as 'tall' or as 'fairly tall'. This is exactly where the ambiguity comes from.
Therefore, the boundary of a term is never a point but a region in which the term
gradually (not sharply) moves from being applicable to being non-applicable.

Linguists have empirically assessed the hypothesis that natural language concepts (terms)
can be described more completely and more precisely using the framework of fuzzy set
theory. The conclusions are positive [HERSH76]. We introduce how to use fuzzy sets to
represent the vagueness of natural language concepts in the later sections.

5.2.  Basic  Concepts  of  Fuzzy  Set  Theory 

Human beings can understand and operate upon vague natural languages. Computers,
however, are extremely rigid and precise information-processing systems. This inherent
rigidity severely limits a computer's ability to abstract and generalize fundamental
conceptual functions [HERSH76]. Since the 1960s, Zadeh and other engineers have developed
quantitative techniques for dealing with vagueness in complex systems. The techniques are
based on fuzzy set theory, a generalization of the traditional theory of sets.

In the following paragraphs, basic concepts related to fuzzy set theory are discussed
before the formal definition of fuzzy sets is introduced.

Linguistic  variables    : 

The unique feature of fuzzy logic is that it allows systems to contain both numeric and
linguistic variables. Linguistic variables are variables whose values can be words or
sentences. They are defined as labels of fuzzy sets. For example, the linguistic variable
complexity may take on the values very-complex, fairly-complex, ..., fairly-simple, and
very-simple.

Difference  between  fuzzy  sets  and  non-fuzzy  sets  : 

In traditional (non-fuzzy) set theory, a membership function specifies whether an
element x is a member of the set X: the truth value of $x \in X$ is 1 if it is and 0 if
it is not. In fuzzy set theory, by contrast, the transition from membership to
non-membership is seldom a step function. Rather, there is a gradual but specifiable
change from membership to non-membership. That is, in fuzzy systems, the grade of
membership, or the corresponding truth value of the proposition $x \in X$, may take any
value in the closed real interval [0.0, 1.0].

Difference  between  fuzzy  membership  and  probability    : 

Fuzziness is distinctly different from the uncertainty measured by the probability of an
event. Probability indicates how likely an event is to occur. That is to say, probability
theory deals with the lack of knowledge about an event that may occur in the future. Once
this knowledge becomes available, the state of affairs is completely determined and no
vagueness is involved. A typical example is a coin toss: the uncertainty of a toss
resulting in a head has a certain probability associated with it, and the uncertainty
disappears once the coin has landed. Fuzziness does not disappear in this way. No matter
how closely one measures or examines, a concept will apply more to some elements of the
universe than to others. A good example is baldness. No matter how carefully we count
the number of hairs a man has, this information cannot make the boundary between bald and
not bald free of imprecision.

Definition of fuzzy sets :

Let X be a universe of elements (persons, DFDs, or heights) with a generic element of X
denoted by x. A fuzzy subset of X, labeled A, is characterized by a membership function,
$f_A$, that associates with each element x in X a real number, $f_A(x)$, in the closed
interval from 0.0 to 1.0, which represents the grade of membership of x in A. An element
of the fuzzy set A thus can be expressed by the ordered pair :

$f_A(x)/x$

where $f_A(x)$ is the grade of membership of x in A [ZADEH65]. The closer the value of
$f_A(x)$ is to 1.0, the higher the grade of membership of x in A. If all the grades are
either 0 or 1, then the set becomes non-fuzzy. Therefore, a traditional non-fuzzy set is
just a special case of fuzzy sets.

One example of a fuzzy set is complex, where the membership function specifies the grade
of membership of complexity measures in the set labeled complex. Representative values
might be: f(42)=0.0, f(194.84)=0.3, f(679.2)=0.7, and f(1947.8)=1.0. The fuzzy set complex
looks like the following :

complex = {0.0/42, 0.3/194.84, ..., 0.7/679.2, 1.0/1947.8}

Thus, a DFD whose complexity measure is 42 is clearly not complex; a DFD whose complexity
measure is 1947.8 is clearly complex; and a DFD whose complexity measure is 679.2 is more
complex than not complex.

Membership in a fuzzy set is specified by a mapping from the universe to the set in
question. This mapping can be performed either by enumeration or by a function. Whatever
the method, the result is that every element in X has associated with it a number
corresponding to its grade of membership in that fuzzy set. Once this mapping is
specified, the set can be used as a linguistic variable in fuzzy inferences.
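
A fuzzy set specified by enumeration can be held in a simple mapping from elements to
grades. The sketch below is only an illustration: the grades are the representative
values quoted in the text for complex, the function names are hypothetical, and treating
unlisted elements as having grade 0.0 is an assumption made here.

```python
# A discrete fuzzy set as a mapping element -> grade of membership.
# Grades are the representative values quoted in the text for 'complex'.
complex_set = {42: 0.0, 194.84: 0.3, 679.2: 0.7, 1947.8: 1.0}

def grade(fuzzy_set, x):
    """Grade of membership of x; unlisted elements get 0.0 (an assumption)."""
    return fuzzy_set.get(x, 0.0)

def is_crisp(fuzzy_set):
    """True when every grade is 0 or 1, i.e. the set is an ordinary set."""
    return all(g in (0.0, 1.0) for g in fuzzy_set.values())

print(grade(complex_set, 679.2))      # 0.7
print(is_crisp(complex_set))          # False
print(is_crisp({42: 0.0, 1947.8: 1.0}))  # True
```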

The discussion of fuzzy membership can be extended by having the grade of membership
itself be a fuzzy set [HERSH76]. For example, if the universe X is a set of DFDs and A is
the fuzzy set complex, then :

complex = {high/DFD1, medium/DFD2, ..., low/DFDn}

Here, 'high', 'medium', and 'low' are fuzzy subsets of the universe of possible grade of
membership values. For example, high can be the following :

high = {1.0/1.0, 0.9/0.98, 0.8/0.9, 0.5/0.8, ..., 0.1/0.5}

A normal fuzzy set is a fuzzy set with at least one element x in X with a grade of
membership of 1.0.

Two fuzzy sets, A and B, are equal (A = B) iff $f_A(x) = f_B(x)$ for all x in X.

A fuzzy set A is contained in a fuzzy set B, or B entails A, or A is a subset of B, iff

$f_A(x) \le f_B(x)$ for all x in X.

The union of two fuzzy sets in fuzzy logic is defined by

$C = A \cup B$

The union fuzzy set C has the following membership function :

$f_C(x) = \max[f_A(x), f_B(x)], \quad x \in X$

The intersection of two fuzzy sets A and B is denoted by

$C = A \cap B$

The intersection C has the following membership function :

$f_C(x) = \min[f_A(x), f_B(x)], \quad x \in X$

The algebraic product of A and B is denoted by AB and is calculated by

$f_{AB}(x) = f_A(x) \cdot f_B(x)$

The algebraic sum of A and B is denoted by A + B and is defined by

$f_{A+B}(x) = f_A(x) + f_B(x)$

if the sum is less than or equal to one.

The sum of fuzzy sets is defined as :

$f_{A \oplus B}(x) = f_A(x) + f_B(x) - f_{AB}(x)$
[ZADEH65]. 
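
The operations defined above translate directly into code for discrete fuzzy sets. The
following is a minimal sketch, assuming each set is stored as a mapping from elements to
grades and that an element missing from a mapping has grade 0.0; the function names and
the two sample sets are hypothetical.

```python
def _elements(a, b):
    return a.keys() | b.keys()

def union(a, b):
    """f_C(x) = max[f_A(x), f_B(x)]"""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in _elements(a, b)}

def intersection(a, b):
    """f_C(x) = min[f_A(x), f_B(x)]"""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in _elements(a, b)}

def algebraic_product(a, b):
    """f_AB(x) = f_A(x) * f_B(x)"""
    return {x: a.get(x, 0.0) * b.get(x, 0.0) for x in _elements(a, b)}

def fuzzy_sum(a, b):
    """f(x) = f_A(x) + f_B(x) - f_AB(x), which always stays within [0, 1]."""
    return {x: a.get(x, 0.0) + b.get(x, 0.0) - a.get(x, 0.0) * b.get(x, 0.0)
            for x in _elements(a, b)}

def is_subset(a, b):
    """A is contained in B iff f_A(x) <= f_B(x) for all x."""
    return all(a.get(x, 0.0) <= b.get(x, 0.0) for x in _elements(a, b))

def is_normal(a):
    """A normal fuzzy set has at least one element with grade 1.0."""
    return any(g == 1.0 for g in a.values())

# Hypothetical fuzzy sets over two DFDs.
A = {"DFD1": 0.2, "DFD2": 0.7}
B = {"DFD1": 0.5, "DFD2": 0.4}
print(union(A, B))
print(intersection(A, B))
print(is_subset(intersection(A, B), union(A, B)))  # True
```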

In the next section, we introduce how we apply fuzzy set theory in this research and
build membership functions for the evaluative phrases used in our survey, such as
very-complex, fairly-complex, or more-complex-than-simple.

5.3.    Fuzzy  Sets  Application  in  This  Research 

In Appendix D and Appendix E, row 62 through row 67 are the frequencies of the answers
across all 6 DFDs (objects). Each row represents an evaluative phrase. For example, row
62 on page D-2 represents the phrase very complex, row 63 fairly complex, row 64 more
complex than simple, and so on. For the normalized survey data, the corresponding
frequencies are shown on the same rows, from row 62 through row 67 of Appendix E. There
each phrase corresponds to a subrange rather than a single value: row 62 represents the
phrase 'very complex', row 63 'fairly complex', and so on. We use the normalized data to
apply fuzzy set theory.

There are basically two ways to build fuzzy membership functions: the statistical
approach and the linguistic approach. Which one should be used depends on the nature of
the data. The statistical approach requires that certain assumptions be satisfied and
that the distribution of the data be known (or at least that it can be assumed with
confidence) [CIVAN86]. The linguistic approach can be used when the data comes from a
poll or survey. Harry M. Hersh and Alfonso Caramazza have reported the results of
building fuzzy membership functions from poll results. They conducted an experiment at
Johns Hopkins University to assess the validity of the fuzzy set representation of
natural language terms [HERSH76]. In our research, the linguistic approach has been
adopted because of the nature of our data.

Because Zadeh (1968) equated the probability of a fuzzy event with the expected grade of
membership of the event, the proportion of yes responses for a particular DFD and a
particular phrase can be interpreted as the grade of membership of that DFD in the fuzzy
set labeled by the phrase. If we furthermore normalize the fuzzy set, we can build the
membership functions for all the evaluative phrases in the following steps :

1)  Choose the largest number in each row (phrase) from row 62 through row 67, and divide
all the values in that row by the chosen number. The results are shown from row 78
through row 83 in Appendix E.

2)  Rearrange the columns of the results from step (1) in the order of the measures of the
corresponding criterion (from the smallest to the largest). For example, for the
complexity criterion (see E-2), use the complexity measures COM calculated for the six
DFDs to rearrange (rank) columns K through P from the relatively simplest DFD to the
relatively most complex DFD (the original order of the six DFDs is : DFD1, DFD2, DFD3,
DFD4, DFD5, DFD6; after rearranging, the new order is : DFD3, DFD2, DFD1, DFD6, DFD5,
DFD4). For the interconnection criterion (see E-4), use the interconnection measures
INTER calculated for the six DFDs to rearrange (rank) the columns from the one with the
lowest interconnection measure to the one with the highest (the new order is : DFD3,
DFD1, DFD2, DFD6, DFD5, DFD4).

It is valid to rearrange the DFDs according to the corresponding measures because
chapter four showed that those measures are valid; that is, they keep the correct order
of the DFDs. The result of this step is shown from row 87 through row 92.

3)  Drop DFD5 from the rearranged data obtained in step (2). That is, remove the data in
column O from the rearranged data. Then carry out any further normalization that is
needed. The result of this step is displayed from row 96 through row 101. The reason for
dropping DFD5 is stated in chapter four (see the Conclusion section of chapter four).

4)  Plot the fuzzy membership functions based on row 96 through row 101. The X-axis is our
universe (the DFDs' measures) and the Y-axis is the grade of membership of the elements
of the universe in a given fuzzy set.

In the data from step (3), each row is now the fuzzy membership function of the
corresponding evaluative phrase. Taken together with the measures of the DFDs, the rows
are the fuzzy sets of the evaluative phrases. For example, on page E-4, row 96 represents
the phrase very-complex, and the values in columns K to O of this row are the grades of
membership of the DFDs' complexity measures in the fuzzy set very-complex. The
corresponding fuzzy set very-complex can be written as:

very-complex  =  {  0.0/42,  0.0/146.76,  0.067/194.8,
0.067/443.9,  1.0/1947.8  }

Here, the universe is (42, 146.76, 194.8, 443.9, 1947.8), a set of complexity measures of
DFDs.
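The enumerated form above pairs each element of the universe with its grade of membership. A minimal Python sketch of that representation follows, using only the values quoted in the text. The names `very_complex`, `support`, `height` and `to_notation` are illustrative and are not part of the original work.

```python
# Fuzzy set very-complex, written in the text as grade/element pairs.
# Keys are DFD complexity measures (the universe); values are grades.
very_complex = {
    42.0: 0.0,
    146.76: 0.0,
    194.8: 0.067,
    443.9: 0.067,
    1947.8: 1.0,
}


def support(fuzzy_set):
    """Elements of the universe whose grade of membership is non-zero."""
    return sorted(x for x, grade in fuzzy_set.items() if grade > 0.0)


def height(fuzzy_set):
    """Largest grade of membership found in the set."""
    return max(fuzzy_set.values())


def to_notation(fuzzy_set):
    """Render the set in the grade/element notation used in the text."""
    pairs = ", ".join(f"{g}/{x:g}" for x, g in sorted(fuzzy_set.items()))
    return "{ " + pairs + " }"


print(to_notation(very_complex))
print(support(very_complex))
print(height(very_complex))
```

A dictionary keeps the pairing explicit, so the same structure can hold the fuzzy set of any other evaluative phrase over the same universe.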

Applying the above four steps to the criteria in question in this research (complexity,
interconnection, and ease of implementation), we obtained figure 5.1, figure 5.2, and
figure 5.3.
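The first three steps can be sketched in Python for the complexity criterion. The counts are the poll frequencies listed for complexity in the data appendix, and the ranked order is the one given in step (2). Two points are assumptions rather than statements from the text: that expansions expn-1 to expn-6 correspond to DFD1 to DFD6, and that a grade is a frequency divided by the largest frequency in its row (a rule inferred from the spreadsheet values). All function names are illustrative.

```python
# Poll frequencies for the complexity question: one row per answer
# (1..6), one column per expansion in the original order.
# Assumption: expn-1..expn-6 correspond to DFD1..DFD6.
ORIGINAL_ORDER = ["DFD1", "DFD2", "DFD3", "DFD4", "DFD5", "DFD6"]
COUNTS = [
    [1, 2, 0, 32, 0, 2],
    [6, 4, 5, 15, 10, 19],
    [22, 13, 5, 4, 17, 18],
    [12, 11, 11, 0, 15, 8],
    [11, 16, 23, 1, 9, 4],
    [0, 6, 8, 0, 1, 0],
]
# Column order after ranking by complexity measure, as given in step (2).
RANKED_ORDER = ["DFD3", "DFD2", "DFD1", "DFD6", "DFD5", "DFD4"]


def normalize(row):
    """Divide a row by its largest entry so that the peak grade is 1.

    This rule is an assumption inferred from the appendix values.
    """
    peak = max(row)
    return [value / peak if peak else 0.0 for value in row]


def reorder(rows, old_order, new_order):
    """Step (2): rearrange the columns into the ranked order."""
    index = [old_order.index(name) for name in new_order]
    return [[row[i] for i in index] for row in rows]


def drop(rows, order, name):
    """Step (3): remove one DFD's column, then renormalize each row."""
    k = order.index(name)
    kept = [row[:k] + row[k + 1:] for row in rows]
    return [normalize(row) for row in kept], order[:k] + order[k + 1:]


grades = [normalize(row) for row in COUNTS]               # step (1)
ranked = reorder(grades, ORIGINAL_ORDER, RANKED_ORDER)    # step (2)
final, final_order = drop(ranked, RANKED_ORDER, "DFD5")   # step (3)

print(final_order)
for row in final:
    print(" ".join(f"{g:.3f}" for g in row))
```

Renormalizing after the drop matters only for rows whose peak was in the removed column; the other rows are unchanged by it.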

Figure 5.1 shows the membership functions for complexity. Figure 5.1 (a) plots the
membership function of very-complex against that of fairly-complex. Figure 5.1 (b) plots the
membership function of more-complex-than-simple against that of more-simple-than-complex.
Figure 5.1 (c) plots the membership function of very-simple against that of fairly-simple.
The titles of the figures explain what they show.

From these figures, we can see that with the addition of a linguistic intensifier such as very,
the graph of the base word moves away from the neutral point toward the extreme. This is
exactly the conclusion that Harry M. Hersh and Alfonso Caramazza reached in their
empirical research.

These figures were produced from only five points. Nevertheless, they present reasonable
and consistent results, and they outline the basic shapes of the membership functions. To
obtain the exact functions, we would need not only more points but also to apply
approximation theory to the data. This research has not yet covered that further study.
Still, the conclusions reached so far provide a good basis for further research in this
direction.

Suppose we have obtained precise membership functions (either as an enumeration or as a
function) for all the evaluative phrases of DFDs. Once a valid measure of a certain DFD is
available, say the complexity measure COM, we can get its grades of membership for the
different levels of complexity. For example, if we have COM=1012.45, we might get a grade
of membership of 0.75 for this DFD in the fuzzy set very-complex, 0.98 in the fuzzy set
fairly-complex, 0.2 in the fuzzy set fairly-simple, and so on. We can use different
approaches to combine them; the union and the sum of fuzzy sets are examples of possible
choices.
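The lookup and the combination described above can be sketched as follows. The sketch makes assumptions that the text does not state: grades between enumerated points are obtained by linear interpolation, measures outside the enumerated range take the grade of the nearest end point, and "sum" is read as the bounded sum (capped at 1). The enumerated points are those of the very-complex set quoted earlier; the grades 0.75 and 0.98 are the illustrative values from the paragraph above. Because those values are hypothetical, the interpolated grade for COM=1012.45 is not expected to reproduce them.

```python
from bisect import bisect_right

# Enumerated membership function of very-complex: (measure, grade) pairs.
VERY_COMPLEX = [(42.0, 0.0), (146.76, 0.0), (194.8, 0.067),
                (443.9, 0.067), (1947.8, 1.0)]


def grade_at(points, x):
    """Grade of membership at x, by linear interpolation (an assumption).

    Values outside the enumerated range are clamped to the end grades.
    """
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, x)
    (x0, g0), (x1, g1) = points[i - 1], points[i]
    return g0 + (g1 - g0) * (x - x0) / (x1 - x0)


def union_grade(*grades):
    """Fuzzy union: the maximum of the grades."""
    return max(grades)


def bounded_sum_grade(*grades):
    """One reading of 'sum': add the grades and cap the result at 1."""
    return min(1.0, sum(grades))


print(grade_at(VERY_COMPLEX, 1012.45))
print(union_grade(0.75, 0.98))
print(bounded_sum_grade(0.75, 0.98))
```

The union keeps the strongest single judgement, while the bounded sum lets several moderate grades reinforce each other; which behaviour is preferable is left open in the text.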

Even though the number of objects used in this research is not enough to make the curves of
the membership functions smooth, the results are interesting and useful, and they encourage
us to continue studying this area.

Figure 5.1 (a)    F.M.F. of VS & FS for complexity
Figure 5.1 (b)    F.M.F. of MC & MS for complexity
Figure 5.1 (c)    F.M.F. of VC & FC for complexity
Figure 5.2 (a)    F.M.F. of VM & FM for interconnection
Figure 5.2 (b)    F.M.F. of MC & MM for interconnection (panel title: More clean & More messy; X-axis: Interconnection measures, with ticks at 9.742, 9.917, 17.08 and 25.63)
Figure 5.2 (c)    F.M.F. of VC & FC for interconnection
Figure 5.3 (a)    F.M.F. of VH & FH for ease of imp.
Figure 5.3 (b)    F.M.F. of VE & FE for ease of imp.
Figure 5.3 (c)    F.M.F. of ME & MH for ease of imp.




Chapter  6 


Conclusions 


The following results have been shown in this research; they are summarized in figure 6.1.
From a DFD, we can construct its linear representation TR by applying the conversion rules
stated in chapter two. After the TR is created, the various kinds of basic counts introduced
in chapter three can be calculated from it. The advanced DFD measures such as ATB, ACB,
APB, DS, LD, INTER, and COM can then be constructed. These measures are in turn
accepted by the fuzzy classification mechanism to obtain linguistic evaluations of the DFD.
The criteria that the fuzzy classification mechanism can currently process are
interconnection, complexity, and ease of implementation (as shown in figure 6.1). These
classified evaluations can be fed back as guidance to DFD designers or even to the process
of decomposition of DFDs. Once revised DFDs are created, they can be evaluated again. This
feedback is shown in the dotted part of figure 6.1.
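The last stage of this scheme, turning a measure into a linguistic evaluation, can be sketched under stated assumptions. The grades below are the outlier-removed complexity rows of the spreadsheet in the data appendix, paired with the five complexity measures quoted in chapter five; they differ slightly from the rounded very-complex example given there. Pairing the rows with phrases in questionnaire answer order, and choosing the phrase or phrases with the highest grade, are both assumptions and are not the classification rule of the original work. One cell that is illegible in the extracted appendix is taken as 0.063, the value of the same cell before DFD5 was dropped.

```python
# Universe: the five complexity measures quoted in chapter five.
UNIVERSE = [42.0, 146.76, 194.8, 443.9, 1947.8]

# Grades of membership per evaluative phrase over UNIVERSE.
# Assumption: rows follow the questionnaire answer order (a) to (f).
MEMBERSHIP = {
    "very-complex":             [0.0, 0.063, 0.031, 0.063, 1.0],
    "fairly-complex":           [0.263, 0.211, 0.316, 1.0, 0.789],
    "more-complex-than-simple": [0.227, 0.591, 1.0, 0.818, 0.182],
    "more-simple-than-complex": [0.917, 0.917, 1.0, 0.667, 0.0],
    "fairly-simple":            [1.0, 0.696, 0.478, 0.174, 0.043],
    "very-simple":              [1.0, 0.75, 0.0, 0.0, 0.0],
}


def grades_for(measure):
    """Grades of one enumerated measure in every fuzzy set."""
    i = UNIVERSE.index(measure)  # enumerated sets: exact lookup only
    return {phrase: row[i] for phrase, row in MEMBERSHIP.items()}


def linguistic_evaluation(measure):
    """Phrases with the highest grade (maximum-membership rule, assumed).

    Ties are returned together because several rows can peak at 1.
    """
    grades = grades_for(measure)
    best = max(grades.values())
    return [phrase for phrase, g in grades.items() if g == best]


for m in UNIVERSE:
    print(m, linguistic_evaluation(m))
```

Returning every tied phrase, rather than a single label, keeps the coarse five-point data from suggesting more precision than it has.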
Based on the previous chapters, a summary is given in the following sections.

6.1.    Advantages 

Several advantages can be seen from this research. They are listed below.

1)  Textual representation, a useful linearized form of a DFD, provides a good foundation
for formalizing the evaluation process of DFDs.

2)  The structure of the TRs makes it possible to automate the calculation of the basic
counts needed in constructing advanced DFD measures.

3)  The DFD measures developed in chapter three of this thesis are specific enough to
show what is contributing to each measure. Therefore, they can guide DFD designers in
how to improve the quality of the DFDs. For example, INTER is constructed from five
factors that reflect different aspects of interconnection, such as average token burden
or data sharing degree. When the interconnection measure of a DFD is too high, we can
find the cause by checking the individual measures and then try to improve the quality.

4)  The DFD measures for a given criterion, together with the fuzzy membership functions
of these measures, provide a norm for the evaluation of DFDs. As mentioned before, if
a measure does not provide a norm against which measured values can be compared, it is
meaningless to apply the measure to an object in isolation. Based on the fuzzy
membership functions, we can provide straightforward linguistic judgements such as 'very
complex'. Terms of this kind have a natural norm based on human common sense or
intuition.

5)  This research has explored another possible way to evaluate the software design tool
DFD.

6)  The scheme (figure 1.1) shows a closed environment for an automatic DFD design tool.
Even though this research has not touched the feedback part of the scheme, it has provided
a framework that indicates a possible direction for future research.

6.2.    Problems 

Several problems were revealed in this research.

1)  The results might not be general enough to validate the DFD evaluation because (a) they
were obtained from a specific survey, and (b) the sample size in this research is not big
enough to support general conclusions. Two possibilities follow from this deficiency:
first, if we change the group of subjects, we might get a different result; second, if we
use a bigger sample, we might also get a different result. However, we think that even if
a change of environment leads to a change in the results, the basic nature of the
conclusions obtained from this research will remain the same.

2)  Some attempts in this research failed, such as the design of a modularity measure and
of a logical cohesion measure. The statistical analysis showed these failures. The
reason, we think, is that these criteria themselves are not yet understood well enough by
either the survey designers or the subjects. We had difficulty designing the questions
about them when preparing the data-collection form of the survey, which means that we
ourselves may not have comprehended them clearly. Also, on checking the survey data, we
noticed that the responses for these criteria are much more spread out than those for
some other criteria such as interconnection. The subjects seemed to answer questions about
these criteria based on different understandings or, alternatively, the criteria might
still be defined too vaguely to give people a clear feeling for them.

6.3.    Possible  Future  Work

This research has presented a reasonable scheme for further study in this field. Possible
directions are:

1)  To reach more general conclusions, bigger samples, in both the number of subjects and
the number of objects, can be used. This will help not only to increase confidence in
the DFD measures but also to approximate the fuzzy membership functions more reliably
and more smoothly.

2)  Building fuzzy membership functions for average token burden (ATB), average connection
burden (ACB), average path burden (APB), data sharing degree (DS), and loop density (LD)
will give DFD designers more intuitive guidance. These measures are the factors of the
interconnection criterion, and interconnection influences complexity and ease of
implementation. Their fuzzy classification will provide information about how to improve
the compound measures.

3)  A better understanding of other measures related to the quality of DFDs will be
necessary, for example modularity or even some new aspects of DFDs. This will also depend
on the development of the whole area of software engineering.

4)  Based on the future work mentioned above, it is possible to formalize the expertise
of evaluating DFDs and to complete the closed environment for automating the process of
DFD design as shown in figure 1.1.
1  7

7  1 

5 

7  2 

5 

7  3 

5 

7  4 

5 

7  5 

5 

7  6 

5 

7  7 

5 

7  8 

5 

7  9 

9 

8  0 

9 

8  1 

7 

8  2 

7 

8  3 

3 

8  4 

3 

8  5 

3 

8  6 

3 

8  7 

7 

8  8 

7 

8  9 

5 

9  0 

5 

9  1 

5 

9  2 

5 

9  3 

3 

9  4 

3 

9  5 

9  6 

2 

9  7 

4 
A

B 

:  o 

E 

F 

G|H|        |J|K|L|M 

N 

1 

Basic  counts  for  DFD-5 

2 

3 

UM 

SM 

LM 

Ul 

CM 

UD 

CO 

CTM 

UIC 

CICi 

CIC 

4 

5 

Measures 

5 

4 

5 

1 

3 

5 

8 

6 

7 

Modules 

1 

8 

8 

2 

3 

9 

3 

3 

1  0 

4 

3 

1  1 

5 

3 

1  2 

Ex  inputs 

1  3 

a 

1 

1  4 

Sum1 

1 

1  5 

Ex  outputs 

a 

1 

1  6 

X 

1 

1  7 

y 

4 

1  8 

Sum2 

6 

1  9 

Interconnects 

M1 

1 

2  0 

M2 

1 

2  1 

M3 

1 

2  2 

M4 

1 

2  3 

X 

4 

2  4 

Sum3 

8 

2  5 

Paths: 

2  6 

a1a 

1 

2  7 

a1x 

2 

2  8 

a1M1    2y 

3 

2  9 

a1M2  3y 

4 

3  0 

a1M3  4y 

5 

3  1 

a1M4  5y 

6 

3  2 

a1M1    2x1M1    2y 

7 

3  3 

a1M1    2x1M2  3y 

8 

3  4 

a1M1    2x1M3   4y 

9 

3  5 

a1M1    2x1M4   5y 

1  0 

3  6 

a1M2  3x1M1    2y 

1  1 

3  7 

a1M2  3x1M2  3y 

1  2 

3  8 

a1M2  3x1M3  4y 

1  3 

3  9 

a1M2  3x1M4  5y 

1  4 

4  0 

a1M3  4x1M1    2y 

1  5 

4  1 

a1M3  4x1M2  3y 

1  6 

4  2 

a1M3   4x1M3   4y 

1  7 

4  3 

a1M3  4x1M4  5y 

1  8 

4  4 

a1M4  5x1M1    2y 

1  9 

4  5 

a1M4  5x1M2  3y 

20 

4  6 

a1M4   5x1M3  4y 

21 
O

P 

Q 

R 

S 

T    |  U  |  V  |  W  |    X         Y 

Z 

1 

Basic  counts  for  DFD-5 

2 

3 

UIV 

CIV 

P 

CW 

NPMi 

DPi 

DP 

a 

LLi 

CAW 

CAD 

CSIZE 

4 

5 

9 

1  5 

22 

4 

9 

4 

1.67 

8.14 

36.00 

6 

7 

22 

8 

5 

9 

5 

1  0 

5 

1  1 

5 

1  2 

1  3 

1  4 

1  5 

1  6 

1  7 

1  8 

1  9 

2  0 

2  1 

2  2 

2  3 

2  4 

2  5 

2  6 

'j 

2  7 

[] 

2  8 

2  9 

3  0 

3  1 

3  2 

3  3 

3  4 

3  5 

3  6 

3  7 

3  8 

3  9 

4  0 

4  1 

4  2 

4  3 

4  4 

4  5 

4  6 
A

B      C    D 

E 

F 

G 

H 

I 

J 

K 

L 

M 

N 

4  7 

a1M4   5x1  M4   5y 

22 

4  8 

Loops: 

4  9 

x  1  M1  2  x 

1 

5  0 

x  1  M2  3  x 

2 

5  1 

x  1  M3  4  x 

3 

5  2 

x  1  M4  5  x 

4 
O

P 

Q 

R 

S 

T 

U 

V 

w 

X 

Y 

Z 

4  7 

9 

4  8 

4  9 

2 

5  0 

2 

5  1 

2 

5  2 

2 
A                               B      (

:  d 

E 

F    |G | H  |    I    | J    |    K 

L 

M 

1 

Basic  counts  for  DFD-6 

2 

3 

UM 

SM 

LM 

Ul 

Cli 

10 

CO 

CTM 

UIC 

CICi 

4 

5 

Measures 

4 

2 

3 

3 

3 

4 

6 

7 

Modules                                            1 

4 

8 

2 

6 

9 

3 

4 

1  0 

4 

2 

1  1 

Ex  inputs 

1  2 

a 

1 

1  3 

b 

1 

1  4 

c 

2 

1  5 

Sum1 

4 

1   6 

Ex  outputs                                x 
1   7

y 

1 

1   8 

c 

1 

1   9 

Sum2 

3 

2  0 

Interconnects                           M 1 

2 

2  1 

M2 

1 

2  2 

a 

1 

2  3 

a 

1 

2  4 

Sum3 

5 

2  5 

Paths: 

2  6 

a  1  x                                               1 

2  7 

a  1  a  2  M2  4  c                                 2 

2  8 

a  1  a  2  a  3  y                                    3 

2  9 

b  2  M2  4  c                                      4 

3  0 

b  2  a  3  y                                          5 

3  1 

c  2  M2  4  c                                       6 

3  2 

c  2  a  3  y                                          7 

3  3 

c  3  y                                                 8 

3  4 

a1a2M11a2M2  4c                9 

3  5 

a1a2M11a2a3y                 10 

3  6 

a1a2a3M11a2M2  4c        11 

3  7 

a1a2a3M11a2a3y           12 

3  8 

Loops: 

3  9 

M1  1  a  2  M1                                    1 

4  0 

M1  1  a  2  a  3  M1                             2 
N

O 

p 

Q 

R 

s    It  I  u  I  v  I  w  I  x 

Y 

Z 

1 

Basic  counts  for  DFD-6 

2 

3 

cc 

UIV 

CIV 

P 

CW 

NPMi  DPi 

DP 

CL 

LLi 

CAW 

CAD 

CSIZE 

4 

I 

5 

5 

1  0 

1  2 

1  2 

2 

1  3 

2 

1.20 

7.33 

26.00 

6 

7 

7l 

8 

1  0! 

9 

7 

1  0 

5 1 

1  1 

1  2 

1  3 

1  4 

1  5 

1  6 

1  7 

1  8 

1  9 

2  0 

2  1 

2  2 

i 

2  3 

! 

2  4 

2  5 

2  6 

3 

2  7 

!      7 

2  8 

7 

2  9 

5 

3  0 

5 

3  1 

I       5 

3  2 

5 

3  3 

3 

3  4 

1  1 

3  5 

1  1 

3  6 

13 

3  7 

13 

3  8 

3  9 

2 

4  0 

3 
Appendix  C  -  Data-collection  Form


The following are the criteria, listed in the order in which the questions will be asked.


0.  Complexity 

1.  Consistency 

2.  Interconnections 

3.  Modularity  of  the  expansion 

4.  Cohesion 

5.  Clarity 

6.  Ease  of  implementation 

7.  Complexity 
Before each question is asked, the definition of the related criterion will be given. Please
read the definition carefully first so that you can answer the questions correctly.


COMPLEXITY 

Complexity of the expansion is the STRUCTURAL complexity of the diagram (not the
psychological complexity of the diagram). High complexity implies that the processing
implied by the diagram is not simple.


1.  Do  you  think  this  expansion  is 

a)  very  complex 

b)  fairly  complex 

c)  more  complex  than  simple 

d)  more  simple  than  complex 

e)  fairly  simple 

f)  very  simple 


exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

CONSISTENCY 

A  box  and  its  expansion  are  consistent  if  they  both  imply  the  same  process. 


2.  Consider  the  consistency  between  the  process  implied  by  the 
original  box  and  the  process  implied  by  the  expansion, 
these  processes  are 

a)  very  consistent 

b)  fairly  consistent 

c)  more  consistent  than  inconsistent 

d)  more  inconsistent  than  consistent 

e)  fairly  inconsistent 

f)  very  inconsistent 


C-2 


exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

INTERCONNECTIVITY 

Interconnection depends on the interface complexity among the boxes of the expansion. Low
interconnection means that each box in the expansion (module) should be easy to develop
independently, i.e., interconnection is a measure of the relative independence among
modules.

We can view interconnections in two aspects:

1)  The number of interconnections in the graph

2)  How they are related to each other (in a simple fashion or in a complicated way)

Usually, a larger number of interconnections will lead to a messier-looking graph
and fewer will lead to a clearer one. But attention must be paid to the following fact:

A small number of interconnections CAN be related in a complicated way (messy) and a lot
of interconnections MIGHT be related in a logically simple (clean) fashion.


3.  Observing  the  interconnections  among  the  modules  in  the  expansion, 
they  are 

a)  very  clean 

b)  fairly  clean 

c)  more  clean  than  messy 

d)  more  messy  than  clean 

e)  fairly  messy 

f)  very  messy 

exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

C-3 


MODULARITY 

Modularity  is  the  extent  to  which  software  is  composed  of  discrete  components  such  that 
a  change  to  one  component  has  minimal  impact  on  other  components. 


4.  Modularity  of  the  expansion  is 


a)  very  good 

b)  fairly  good 

c)  better  than  in  between 

d)  worse  than  in  between 

e)  fairly  poor

f)  very  poor

exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

COHESION 

Cohesion is a measure of the relative functional strength of a module. We can view
cohesion of the expansion from two standpoints:

1)  for each module in the expansion, consider the functional cohesion, i.e., a
highly cohesive module should (ideally) do just one task. Modules with high
functional cohesion have singularity of task.

2)  for the whole expansion, consider the cohesion of the whole diagram, i.e., see
how strongly the modules in the expansion are related to each other and to
the task of the original box. Low cohesion diagrams do many unrelated tasks.

Note:     the questions concerning the different aspects of cohesion will be given separately.

5.  The average functional cohesion of modules in the expansion is


a)  very  good 

b)  fairly  good 

c)  better  than  in  between 

d)  worse  than  in  between 

e)  fairly  poor 

f)  very  poor 

exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

6.  The  overall  cohesion  of  the  whole  diagram  is 

a)  very  good 

b)  fairly  good 

c)  better  than  in  between 

d)  worse  than  in  between 

e)  fairly  poor 

f)  very  poor 


exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

CLARITY 


Clarity is related to how easily the expansion would be understood. It can be viewed
from two aspects:


a)       processes


By observing the expansion from the original box into the expanded diagram, does the
process seem clear or not?

b)       details

By observing the expanded diagram, does it provide enough detailed information to reflect
the meaning of the original box?

Note:     The two questions concerning the two aspects of clarity will also be given
separately.


7.  Observing the way in which the original box has been expanded,
it is ______ to understand the PROCESS.


a)  very  easy 

b)  fairly  easy 

c)  more  easy  than  hard 

d)  more  hard  than  easy 

e)  fairly  hard 

f)  very  hard 

exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

8.  By observing the level of detail of the expansion, it is ______ to
understand the original box.


a)  very  easy 

b)  fairly  easy 

c)  more  easy  than  hard 

d)  more  hard  than  easy 

e)  fairly  hard 

f)  very  hard 
exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

EASE  OF  IMPLEMENTATION 

By  observing  the  overall  structure  of  the  expanded  diagram,  how  easy  would  it  be  to 
implement  it  ? 


9.  Based  on  the  structure  of  the  expanded  diagram,  the 
implementation  of  this  diagram  should  be 

a)  very  easy 

b)  fairly  easy 

c)  more  easy  than  hard 

d)  more  hard  than  easy 

e)  fairly  hard 

f)  very  hard 

exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)

10.  Based  on  the  thoughts  developed  in  answering  above  questions, 
you  think  that  the  expanded  diagram  is 


a)  very  complex 

b)  fairly  complex 

c)  more  complex  than  simple 

d)  more  simple  than  complex 

e)  fairly  simple

f)  very  simple
exp-1       exp-2       exp-3       exp-4       exp-5       exp-6

(a)         (a)         (a)         (a)         (a)         (a)
(b)         (b)         (b)         (b)         (b)         (b)
(c)         (c)         (c)         (c)         (c)         (c)
(d)         (d)         (d)         (d)         (d)         (d)
(e)         (e)         (e)         (e)         (e)         (e)
(f)         (f)         (f)         (f)         (f)         (f)
Collected data for evaluating DFDs complexity

Row  Complexity  expn-1  expn-2  expn-3  expn-4  expn-5  expn-6    Avg
  6  p1q              4       6       6       1       3       2   3.67
  7  p2q              4       4       2       2       4       2   3.00
  8  p3q              3       5       5       2       3       4   3.67
  9  p4q              4       3       4       3       5       2   3.50
 10  p5q              3       5       6       1       4       3   3.67
 11  p7q              3       2       4       1       3       2   2.50
 12  p8q              4       4       4       1       3       3   3.17
 13  p9q              2       6       5       3       6       2   4.00
 14  p10q             5       3       4       1       4       3   3.33
 15  p11q             3       2       5       1       4       2   2.83
 16  p12q             5       3       5       2       5       3   3.83
 17  p13q             3       5       5       3       5       5   4.33
 18  p14q             3       4       2       1       5       3   3.00
 19  p15q             3       4       5       1       2       3   3.00
 20  p16q             4       3       5       2       4       3   3.50
 21  p17q             3       1       5       1       3       2   2.50
 22  p18q             3       4       5       1       3       2   3.00
 23  p19u             2       4       5       1       3       3   3.00
 24  p20u             3       5       2       1       3       4   3.00
 25  p21u             2       2       4       1       2       2   2.17
 26  p22u             2       3       5       1       3       4   3.00
 27  p23u             2       3       3       1       2       3   2.33
 28  p24u             5       3       4       2       3       3   3.33
 29  p25u             3       1       4       2       4       3   2.83
 30  p26u             4       5       3       1       4       1   3.00
 31  p27u             4       4       6       1       4       5   4.00
 32  p28u             5       5       6       2       4       4   4.33
 33  p29u             3       3       4       1       2       2   2.50
 34  p30u             3       2       4       1       2       2   2.33
 35  p31u             3       4       5       1       5       3   3.50
 36  p32u             5       6       6       2       5       4   4.67
 37  p33u             5       5       6       2       3       3   4.00
 38  p34u             3       3       2       1       2       3   2.33
 39  p35u             3       4       5       2       3       4   3.50
 40  p36u             3       5       5       1       5       2   3.50
 41  p37u             3       5       5       2       3       3   3.50
 42  p38u             4       3       4       1       2       4   3.00
 43  p39u             5       3       6       1       5       4   4.00
 44  p40u             4       5       5       1       4       2   3.50
 45  p41u             2       4       3       1       2       3   2.50
 46  p42u             3       4       3       1       3   ?????   2.80
D-  1 


Row  Complexity  expn-1  expn-2  expn-3  expn-4  expn-5  expn-6    Avg
 47  p43u             3       5       5       2       4       2   3.50
 48  p44u             4       3       5       1       3       2   3.00
 49  p45u             3       3       2       1       2       2   2.17
 50  p46u             5       5       6       3       3       5   4.50
 51  p47u             4       5       5       1       2       2   3.17
 52  p48u             3       5       5       5       3       3   4.00
 53  p49u             5       6       5       2       4       3   4.17
 54  p50u             4       5       5       1       4       2   3.50
 55  p51u             1       5       6       1       4       5   3.67
 56  p52u             5       5       4       2       5       2   3.83
 57  p53u             5       6       3       2       4       1   3.50

 62  Freq. 1          1       2       0      32       0       2
 63  Freq. 2          6       4       5      15      10      19
 64  Freq. 3         22      13       5       4      17      18
 65  Freq. 4         12      11      11       0      15       8
 66  Freq. 5         11      16      23       1       9       4
 67  Freq. 6          0       6       8       0       1       0

 69  total           52      52      52      52      52      51

 71  AVG           3.50    4.00    4.48    1.52    3.50    2.86
 72  S.D.          1.02    1.28    1.18    0.80    1.06    1.00

 74  Curve Data :

 76  Calculate fuzzy membership grades from the poll results

 78               0.031   0.063       0       1       0   0.063
 79               0.316   0.211   0.263   0.789   0.526       1
 80                   1   0.591   0.227   0.182   0.773   0.818
 81                 0.8   0.733   0.733       0       1   0.533
 82               0.478   0.696       1   0.043   0.391   0.174
 83                   0    0.75       1       0   0.125       0

 85  Rearrange the columns in the order of complexity measures

 87                   0   0.063   0.031   0.063       0       1
 88               0.263   0.211   0.316       1   0.526   0.789
 89               0.227   0.591       1   0.818   0.773   0.182
 90               0.733   0.733     0.8   0.533       1       0
 91                   1   0.696   0.478   0.174   0.391   0.043
 92                   1    0.75       0       0   0.125       0

D-2 


Row
 94  Remove the outlier DFD5 and normalize when needed

 96                   0   0.063   0.031   0.063       1
 97               0.263   0.211   0.316       1   0.789
 98               0.227   0.591       1   0.818   0.182
 99               0.917   0.917       1   0.667       0
100                   1   0.696   0.478   0.174   0.043
101                   1    0.75       0       0       0

D-3 


A  B  C  D  E  F  G  H  I

  2  Collected data for evaluating DFDs' interconnection

  4  Interconnection   expn-1  expn-2  expn-3  expn-4  expn-5  expn-6     Avg

  6  p1q                    2       1       1       6       4       5    3.17
  7  p2q                    1       3       3       5       2       4    3.00
  8  p3q                    2       3       1       4       3       3    2.67
  9  p4q                    2       3       1       6       2       5    3.17
 10  p5q                    2       1       1       4       2       4    2.33
 11  p7q                    2       5       6       3       2       5    3.83
 12  p8q                    2       2       3       5       2       3    2.83
 13  p9q                    3       2       2       4       2       5    3.00
 14  p10q                   2       3       4       5       2       5    3.50
 15  p11q                   2       3       2       6       3       3    3.17
 16  p12q                   2       3       1       5       3       4    3.00
 17  p13q                   2       1       1       4       2       2    2.00
 18  p14q                   2       4       2       5       4       4    3.50
 19  p15q                   3       2       4       5       6       3    3.83
 20  p16q                   2       3       5       5       2       3    3.33
 21  p17q                   1       3       3       6       4       4    3.50
 22  p18q                   3       1       1       6       3       3    2.83
 23  p19u                   2       2       1       6       2       3    2.67
 24  p20u                   2       3       4       6       4       5    4.00
 25  p21u                   1       3       1       6       3       4    3.00
 26  p22u                   1       1       1       6       4       5    3.00
 27  p23u                   1       2       1       6       3       4    2.83
 28  p24u                   2       3       3       5       4       5    3.67
 29  p25u                   2       2       1       5       3       4    2.83
 30  p26u                   2       3       1       3       2       4    2.50
 31  p27u                   2       3       1       6       3       3    3.00
 32  p28u                   1       3       1       4       2       2    2.17
 33  p29u                   2       2       1       6       4       5    3.33
 34  p30u                   2       3       1       5       3       4    3.00
 35  p31u                   2       3       2       6       3       4    3.33
 36  p32u                   2       3       1       4       2       3    2.50
 37  p33u                   2       2       1       5       2       3    2.50
 38  p34u                   2       2       3       4       3       4    3.00
 39  p35u                   2       2       2       3       2       3    2.33
 40  p36u                   2       2       1       5       2       5    2.83
 41  p37u                   2       2       1       5       4       4    3.00
 42  p38u                   1       2       3       6       3       3    3.00
 43  p39u                   1       3       1       5       2       3    2.50
 44  p40u                   4       2       2       6       3       4    3.50
 45  p41u                   3       3       2       6       3       3    3.33
 46  p42u                   4       3       2       5       2       3    3.17

D-4 


A  B  C  D  E  F  G  H  I

 47  p43u                   2       1       2       6       3       6    3.33
 48  p44u                   2       3       1       4       2       2    2.33
 49  p45u                   2       3       4       6       3       4    3.67
 50  p46u                   2       2       1       5       2       4    2.67
 51  p47u                   2       3       1       6       5       6    3.83
 52  p48u                   2       3       2       3       5       4    3.17
 53  p49u                   1       2       1       5       3       3    2.50
 54  p50u                   2       3       1       4       2       3    2.50
 55  p51u                   2       2       1       6       2       3    2.67
 56  p52u                   2       1       2       6       2       3    2.67
 57  p53u                   1       2       1       3       3       4    2.33

 62  Freq. 1               10       7      29       0       0       0
 63  Freq. 2               36      18      11       0      23       3
 64  Freq. 3                4      25       6       5      18      19
 65  Freq. 4                2       1       4       9       8      18
 66  Freq. 5                0       1       1      17       2      10
 67  Freq. 6                0       0       1      21       1       2

 69  total                 52      52      52      52      52      52

 71  Avg                 1.96    2.44    1.85    5.04    2.85    3.79
 72  S.D.                0.66    0.83    1.21    0.99    0.96    0.96

 76  Calculate fuzzy membership grades from the poll results

 78                     0.345   0.241       1       0       0       0
 79                         1     0.5   0.306       0   0.639   0.083
 80                      0.16       1    0.24     0.2    0.72    0.76
 81                     0.111   0.056   0.222     0.5   0.444       1
 82                         0   0.059   0.059       1   0.118   0.588
 83                         0       0   0.048       1   0.048   0.095

 85  Rearrange the columns in the order of interconnection measure

 87                         1   0.345   0.241       0       0       0
 88                     0.306       1     0.5   0.083   0.639       0
 89                      0.24    0.16       1    0.76    0.72     0.2
 90                     0.222   0.111   0.056       1   0.444     0.5
 91                     0.059       0   0.059   0.588   0.118       1
 92                     0.048       0       0   0.095   0.048       1

D-5 
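
The membership grades on this page appear to be each frequency row divided by its own largest count (for example, 10/29 is about 0.345 in the Freq. 1 row). The following is a minimal Python sketch under that assumption; the function name `membership_grades` is illustrative and does not come from the original spreadsheet.

```python
def membership_grades(freq_rows):
    """Scale each frequency row so that its largest count gets grade 1."""
    grades = []
    for row in freq_rows:
        peak = max(row)
        # An all-zero row has no peak; leave it at zero instead of dividing by 0.
        grades.append([round(c / peak, 3) if peak else 0.0 for c in row])
    return grades


# Freq. 1 .. Freq. 6 counts for expn-1 .. expn-6, copied from the poll table.
freq = [
    [10, 7, 29, 0, 0, 0],
    [36, 18, 11, 0, 23, 3],
    [4, 25, 6, 5, 18, 19],
    [2, 1, 4, 9, 8, 18],
    [0, 1, 1, 17, 2, 10],
    [0, 0, 1, 21, 1, 2],
]
grades = membership_grades(freq)
for row in grades:
    print(row)
```

Scaling by the row maximum rather than by the row total keeps the most frequently chosen item at grade 1, which is what the printed grades show.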


J  K  L  M  N  O  P

 47  p43u               -0.50    1.50    1.50   -1.50    0.50   -1.50
 48  p44u                1.00    0.00    2.00   -2.00    0.00   -1.00
 49  p45u                0.83    0.83   -0.17   -1.17   -0.17   -0.17
 50  p46u                0.50    0.50    1.50   -1.50   -1.50    0.50
 51  p47u                0.83    1.83    1.83   -2.17   -1.17   -1.17
 52  p48u               -1.00    1.00    1.00    1.00   -1.00   -1.00
 53  p49u                0.83    1.83    0.83   -2.17   -0.17   -1.17
 54  p50u                0.50    1.50    1.50   -2.50    0.50   -1.50
 55  p51u               -2.67    1.33    2.33   -2.67    0.33    1.33
 56  p52u                1.17    1.17    0.17   -1.83    1.17   -1.83
 57  p53u                1.50    2.50   -0.50   -1.50    0.50   -2.50

 59  Range      Max: 2.5    Min: -3    Total: 5.5
 60  Subrange   0.917

 62  -3.0 -- -2.084         1       0       0      15       0       1
 63  -2.084 -- -1.168       2       1       0      32       2      11
 64  -1.168 -- -0.252      12      10       5       4      15      20
 65  -0.252 -- 0.664       19       9       6       0      21      11
 66  0.664 -- 1.58         16      25      24       1      12       8
 67  1.58 -- 2.5            2       7      17       0       2       0

 69  total                 52      52      52      52      52      51

 71  Avg                 0.10    0.69    1.17   -1.79    0.19   -0.46
 72  S.D.                0.83    0.98    0.90    0.68    0.82    0.91
 73  Max                 1.67    2.50    2.50    1.00    2.00    1.33
 74  Min                -2.67   -1.83   -1.00   -3.00   -1.50   -2.50

 76  Calculate fuzzy membership grades from the poll results

 78                     0.067       0       0       1       0   0.067
 79                     0.063   0.031       0       1   0.063   0.344
 80                       0.6     0.5    0.25     0.2    0.75       1
 81                     0.905   0.429   0.286       0       1   0.524
 82                      0.64       1    0.96    0.04    0.48    0.32
 83                     0.118   0.412       1       0   0.118       0

 85  Rearrange the columns in the order of complexity measures

 87                         0       0   0.067   0.067       0       1
 88                         0   0.031   0.063   0.344   0.063       1
 89                      0.25     0.5     0.6       1    0.75     0.2
 90                     0.286   0.429   0.905   0.524       1       0
 91                      0.96       1    0.64    0.32    0.48    0.04
 92                         1   0.412   0.118       0   0.118       0

E-  2 
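
The Range and Subrange rows on this page suggest that the normalized scores are split into six equal-width subranges (5.5 / 6 is about 0.917) and then counted per subrange. A minimal Python sketch of that counting step follows. The sample values are hypothetical, and how the original spreadsheet treated a score that falls exactly on a subrange boundary is not stated, so the rule used here is an assumption.

```python
def subrange_counts(values, n_bins=6):
    """Split [min, max] into n_bins equal subranges and count values in each."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("all values are equal; no subranges to form")
    counts = [0] * n_bins
    for v in values:
        # The maximum lands exactly on the top edge; keep it in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return width, counts


# Hypothetical normalized scores spanning the -3 .. 2.5 range reported above.
sample = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.0, 2.5]
width, counts = subrange_counts(sample)
print(round(width, 3), counts)
```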


J  K  L  M  N  O  P

 94  Remove the outlier DFD5 and normalize when needed

 96                         0       0   0.067   0.067       1
 97                         0   0.031   0.063   0.344       1
 98                      0.25     0.5     0.6       1     0.2
 99                     0.316   0.474       1   0.579       0
100                      0.96       1    0.64    0.32    0.04
101                         1   0.412   0.118       0       0

E-  3 
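
The step "Remove the outlier DFD5 and normalize when needed" can be read as: drop the DFD5 column from each row of membership grades, and rescale a row only if the grade of 1 was in the dropped column. The Python sketch below assumes DFD5 is the fifth column of the rearranged table (index 4), which is consistent with the printed rows; the function name is illustrative.

```python
def drop_and_renormalize(grades, drop_index):
    """Remove one column, then rescale rows whose peak grade fell below 1."""
    result = []
    for row in grades:
        kept = [g for i, g in enumerate(row) if i != drop_index]
        peak = max(kept)
        if 0 < peak < 1:
            # The removed column held the row's grade of 1, so rescale.
            kept = [round(g / peak, 3) for g in kept]
        result.append(kept)
    return result


# Rows 89 and 90 of the rearranged table on the preceding page.
rearranged = [
    [0.25, 0.5, 0.6, 1, 0.75, 0.2],
    [0.286, 0.429, 0.905, 0.524, 1, 0],
]
cleaned = drop_and_renormalize(rearranged, drop_index=4)
print(cleaned)
```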


J  K  L  M  N  O  P

  2  Normalized data for evaluating DFDs' interconnection

  4  Interconnection   expn-1  expn-2  expn-3  expn-4  expn-5  expn-6

  6  p1q                -1.17   -2.17   -2.17    2.83    0.83    1.83
  7  p2q                -2.00    0.00    0.00    2.00   -1.00    1.00
  8  p3q                -0.67    0.33   -1.67    1.33    0.33    0.33
  9  p4q                -1.17   -0.17   -2.17    2.83   -1.17    1.83
 10  p5q                -0.33   -1.33   -1.33    1.67   -0.33    1.67
 11  p7q                -1.83    1.17    2.17   -0.83   -1.83    1.17
 12  p8q                -0.83   -0.83    0.17    2.17   -0.83    0.17
 13  p9q                 0.00   -1.00   -1.00    1.00   -1.00    2.00
 14  p10q               -1.50   -0.50    0.50    1.50   -1.50    1.50
 15  p11q               -1.17   -0.17   -1.17    2.83   -0.17   -0.17
 16  p12q               -1.00    0.00   -2.00    2.00    0.00    1.00
 17  p13q                0.00   -1.00   -1.00    2.00    0.00    0.00
 18  p14q               -1.50    0.50   -1.50    1.50    0.50    0.50
 19  p15q               -0.83   -1.83    0.17    1.17    2.17   -0.83
 20  p16q               -1.33   -0.33    1.67    1.67   -1.33   -0.33
 21  p17q               -2.50   -0.50   -0.50    2.50    0.50    0.50
 22  p18q                0.17   -1.83   -1.83    3.17    0.17    0.17
 23  p19u               -0.67   -0.67   -1.67    3.33   -0.67    0.33
 24  p20u               -2.00   -1.00    0.00    2.00    0.00    1.00
 25  p21u               -2.00    0.00   -2.00    3.00    0.00    1.00
 26  p22u               -2.00   -2.00   -2.00    3.00    1.00    2.00
 27  p23u               -1.83   -0.83   -1.83    3.17    0.17    1.17
 28  p24u               -1.67   -0.67   -0.67    1.33    0.33    1.33
 29  p25u               -0.83   -0.83   -1.83    2.17    0.17    1.17
 30  p26u               -0.50    0.50   -1.50    0.50   -0.50    1.50
 31  p27u               -1.00    0.00   -2.00    3.00    0.00    0.00
 32  p28u               -1.17    0.83   -1.17    1.83   -0.17   -0.17
 33  p29u               -1.33   -1.33   -2.33    2.67    0.67    1.67
 34  p30u               -1.00    0.00   -2.00    2.00    0.00    1.00
 35  p31u               -1.33   -0.33   -1.33    2.67   -0.33    0.67
 36  p32u               -0.50    0.50   -1.50    1.50   -0.50    0.50
 37  p33u               -0.50   -0.50   -1.50    2.50   -0.50    0.50
 38  p34u               -1.00   -1.00    0.00    1.00    0.00    1.00
 39  p35u               -0.33   -0.33   -0.33    0.67   -0.33    0.67
 40  p36u               -0.83   -0.83   -1.83    2.17   -0.83    2.17
 41  p37u               -1.00   -1.00   -2.00    2.00    1.00    1.00
 42  p38u               -2.00   -1.00    0.00    3.00    0.00    0.00
 43  p39u               -1.50    0.50   -1.50    2.50   -0.50    0.50
 44  p40u                0.50   -1.50   -1.50    2.50   -0.50    0.50
 45  p41u               -0.33   -0.33   -1.33    2.67   -0.33   -0.33
 46  p42u                0.83   -0.17   -1.17    1.83   -1.17   -0.17

E-4 
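
Comparing the normalized table with the collected data suggests that each respondent's six ratings are centered on that respondent's own average (for p1q, 2 - 3.17 = -1.17, and so on). A minimal Python sketch of that step, with an illustrative function name:

```python
def center_on_own_mean(ratings):
    """Subtract a respondent's mean rating from each of that respondent's ratings."""
    mean = sum(ratings) / len(ratings)
    return [round(r - mean, 2) for r in ratings]


# Raw interconnection ratings of respondents p1q and p20u for expn-1 .. expn-6.
p1q = center_on_own_mean([2, 1, 1, 6, 4, 5])
p20u = center_on_own_mean([2, 3, 4, 6, 4, 5])
print(p1q)
print(p20u)
```

Centering removes the difference between respondents who rate everything high and those who rate everything low, so only the relative ordering of the six DFDs remains.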
J  K  L  M  N  O  P

 47  p43u               -1.33   -2.33   -1.33    2.67   -0.33    2.67
 48  p44u               -0.33    0.67   -1.33    1.67   -0.33   -0.33
 49  p45u               -1.67   -0.67    0.33    2.33   -0.67    0.33
 50  p46u               -0.67   -0.67   -1.67    2.33   -0.67    1.33
 51  p47u               -1.83   -0.83   -2.83    2.17    1.17    2.17
 52  p48u               -1.17   -0.17   -1.17   -0.17    1.83    0.83
 53  p49u               -1.50   -0.50   -1.50    2.50    0.50    0.50
 54  p50u               -0.50    0.50   -1.50    1.50   -0.50    0.50
 55  p51u               -0.67   -0.67   -1.67    3.33   -0.67    0.33
 56  p52u               -0.67   -1.67   -0.67    3.33   -0.67    0.33
 57  p53u               -1.33   -0.33   -1.33    0.67    0.67    1.67

 59  Range      Max: 3.33    Min: -2.83    Total: 6.167
 60  Subrange   1.028

 62  -2.83 -- -1.802        9       5      14       0       1       0
 63  -1.802 -- -0.774      25      15      24       1       8       1
 64  -0.774 -- 0.254       16      23      11       1      30      11
 65  0.254 -- 1.282         2       9       1       6      11      26
 66  1.282 -- 2.31          0       0       2      21       2      13
 67  2.31 -- 3.33           0       0       0      23       0       1

 69  total                 52      52      52      52      52      52

 71  Avg                -1.02   -0.54   -1.14    2.05   -0.14    0.80
 72  S.D.                0.6S    0.78    0.98    0.90    0.78    0.77
 73  Max                 0.83    1.17    2.17    3.33    2.17    2.67
 74  Min                -2.50   -2.33   -2.83   -0.83   -1.83   -0.83

 76  Calculate fuzzy membership grades from the poll results

 78                     0.643   0.357       1       0   0.071       0
 79                         1     0.6    0.96    0.04    0.32    0.04
 80                     0.533   0.767   0.367   0.033       1   0.367
 81                     0.077   0.346   0.038   0.231   0.423       1
 82                         0       0   0.095       1   0.095   0.619
 83                         0       0       0       1       0   0.043

 85  Rearrange the columns in the order of interconnection measure

 87                         1   0.643   0.357       0   0.071       0
 88                      0.96       1     0.6    0.04    0.32    0.04
 89                     0.367   0.533   0.767   0.367       1   0.033
 90                     0.038   0.077   0.346       1   0.423   0.231
 91                     0.095       0       0   0.619   0.095       1
 92                         0       0       0   0.043       0       1
J  K  L  M  N  O  P

 94  Remove the outlier DFD5 and normalize when needed

 96                         1   0.643   0.357       0       0
 97                      0.96       1     0.6    0.04    0.04
 98                     0.478   0.695       1   0.478   0.043
 99                     0.038   0.077   0.346       1   0.231
100                     0.095       0       0   0.619       1
101                         0       0       0   0.043       1
Normalized data for evaluating DFDs' modularity

3 

4 

Modularity 

expn-1 

expn-2 

expn-3 

expn-4 

expn-5 

expn-6 

5 

6 

p1q 

-2.67 

-1.67 

1.33 

1.33 

0.33 

1.33 

7 

p2q 

O.OC 

-1  .00 

1.00 

0.00 

1.00 

-1  .00 

8 

P3q 

0.5C 

-1  .50 

-0.50 

1.50 

-0.50 

0.50 

9 

p4q 

-1  .33 

1.67 

1  .67 

0.67 

-1  .33 

-1  .33 

1  0 

P5q 

-0.67 

-0.67 

-0.67 

1.33 

-0.67 

1.33 

1   1 

p7q 

-1  .32 

0.67 

1  .67 

0.67 

-2.33 

0.67 

1  2 

P8q 

-1  .OC 

0.00 

0.00 

1.00 

-1  .00 

1.00 

1  3 

P9q 

-0.32 

-0.33 

-0.33 

-0.33 

-0.33 

1.67 

1  4 

p10q 

-1  .32 

-2.33 

1  .67 

1.67 

-1  .33 

1.67 

1  5 

p11q 

O.OC 

-1  .00 

1  .00 

0.00 

0.00 

0.00 

1   6 

p12q 

-0.32 

-1  .33 

0.67 

0.67 

-0.33 

0.67 

1   7 

p13q 

0.5C 

-0.50 

-0.50 

1.50|    -0.50 

-0.50 

1   8 

p14q 

-1  .82 

0.17 

0.17 

1.17 

0.17 

0.17 

1  9 

p15q 

-  0  .  5  C 

-2.50 

0.50 

1.50 

1.50 

-0.50 

2  0 

p16q 

O.OC 

-2.00 

1  .00 

1.00 

0.00 

0.00 

2  1 

p17q 

-1  .32 

-1  .33 

-0.33 

2.67 

-0.33 

0.67 

2  2 

p18q 

-0.5C 

0.50 

1  .50 

-0.50 

-0.50 

-0.50 

2  3 

p19u 

-0.32 

-0.33 

-1  .33 

2.67 

-0.33 

-0.33 

2  4 

p20u 

-0.5C 

-1  .50 

1.50 

-0.50 

0.50 

0.50 

2  5 

p21u 

-2.82 

0.17 

1.17 

2.17 

1.17 

-1  .83 

2  6 

p22u 

O.OC 

0.00 

-1  .00 

1.00 

0.00 

0.00 

2  7 

p23u 

-1  .67 

-0.67 

0.33 

1.33 

0.33 

0.33 

2  8 

p24u 

-0.5C 

-1.50 

0.50 

1.50 

0.50 

-0.50 

2  9 

p25u 

-1  .OC 

2.00 

-1  .00 

0.00 

-1  .00 

1.00 

3  0 

p26u 

-0.5C 

-1  .50 

1.50 

0.50 

-1  .50 

1.50 

3  1 

p27u 

-0.67 

-0.67 

-0.67 

1.33 

0.33 

0.33 

3  2 

p28u 

0.5C 

-0.50 

-0.50 

0.50 

-0.50 

0.50 

3  3 

p29u 

-1  ,5C 

0.50 

-0.50 

1.50 

-0.50 

0.50 

3  4 

p30u 

-1  .5C 

-1  .50 

0.50 

1.50 

-0.50 

1.50 

3  5 

p31u 

-0.82 

0.17 

0.17 

1.17 

-0.83 

0.17 

3  6 

p32u 

-0.17 

-1.17 

-0.17 

1.83 

-1.17 

0.83 

3  7 

p33u 

-1  .32 

-0.33 

0.67 

-0.33 

-0.33 

1.67 

3  8 

p34u 

0.32 

-0.67 

0.33 

0.33 

-0.67 

0.33 

3  9 

p35u 

-0.32 

-0.33 

-0.33 

-0.33 

0.67 

0.67 

4  0 

p36u 

-0.5C 

0.50 

1.50 

1.50 

-1  .50 

-1  .50 

4  1 

p37u 

-0.5C 

1.50 

1.50 

-1  .50 

-0.50 

-0.50 

4  2 

p38u 

-0.67 

-0.67 

0.33 

0.33 

0.33 

0.33 

4  3 

p39u 

0.32 

0.33 

-0.67 

0.33 

-0.67 

0.33 

4  4 

p40u 

-0.5C 

-0.50 

0.50 

1.50 

-1.50 

0.50 

4  5 

p41u 

-1.17 

-1.17 

1.83 

0.83 

-0.17 

-0.1  7 

4  6 

p42u 

-0.5C 

0.50 

-0.50 

0.50 

-0.50 

0.50 
4  7

p43u 

-1.33 

-2.33 

0.67 

1.67 

-1  .33 

2.67 

4  8 

p44u 

1.0C 

0.00 

2.00 

-2.00 

0.00 

-1  .00 

4  9 

p45u 

-1  .17 

-1.17 

-0.17 

0.83 

0.83 

0.83 

5  0 

p46u 

-0.33 

-0.33 

-0.33 

1.67 

-0.33 

-0.33 

5  1 

p47u 

-1.0C 

-1.00 

0.00 

1.00 

0.00 

1.00 

5  2 

p48u 

-0.33 

-0.33 

-0.33 

0.67 

0.67 

-0.33 

5  3 

p49u 

-0.17 

-1.17 

-1.17 

1.83 

-0.17 

0.83 

5  4 

p50u 

-0.83 

-0.83 

-0.83 

1.17 

0.17 

1.17 

5  5 

p51u 

-0.33 

-0.33 

1.67 

-0.33 

-0.33 

-0.33 

5  6 

p52u 

-0.5C 

-1.50 

2.50 

0.50 

-1  .50 

0.50 

5  7 

p53u 

-2.0C 

-2.00 

1.00 

2.00 

0.00 

1  .00 

5  8 

5  9 

Range 

Max: 

2.667 

Min: 

-2.83 

Total: 

5.5 

6  0 

Subrange 

0.917 

6  1 

6  2 

-2.83     --     -1.914 

3 

5 

0 

1 

1 

0 

6  3 

-1.914     -     -0.998 

1  E 

1  6 

4 

.1 

1  0 

5 

6  4 

-0.998    --    0.082 

24 

1  6 

1  7 

6 

22 

1  0 

6  5 

0.082    --    0.834 

9 

1  2 

1  3 

16 

1  6 

24 

6  6 

0.834    -    1.75 

1 

2 

1  5 

22 

3 

1  2 

6  7 

1.75    -    2.67 

0 

1 

3 

6 

0 

1 

6  8 

6  9 

total 

52 

52 

52 

52 

52 

52 

7  0 

7  1 

Avg 

-o.6e 

-0.61 

0.39 

0.86 

-0.32 

0.36 

7  2 

S.D. 

0.77 

0.98 

0.96 

0.94 

0.77 

0.88 

7  3 

Max 

1.00 

2.00 

2.50 

2.67 

1.50 

2.67 

7  4 

Min 

-2.83 

-2.50 

-1.33 

-2.00 

-2.33 

-1  .83 
Normalized data for evaluating DFDs' logical cohesion

3 

4 

Logical cohesion

expn-1 

expn-2 

expn-3 

expn-4 

expn-5 

expn-6 

5 

6 

p1q

-0.33 

-1  .33 

0.67 

1.67 

-0.33 

-0.33 

7 

P2q 

-1  .00 

0.00 

1.00 

0.00 

0.00 

0.00 

8 

P3q 

-0.33 

-1  .33 

-0.33 

0.67 

1.67 

-0.33 

9 

p4q 

-0.67 

0.33 

-1  .67 

1.33 

-0.67 

1.33 

1   0 

p5q 

-1  .00 

- 1  .  0  c 

-1  .00 

2.QC 

0.00 

1.00 

1   1 

p7q 

-0.53 

1.50 

1.50 

-0.50 

-1  .50 

-0.50 

1  2 

P8q 

0.50 

-0.50 

-0.50 

0.50 

0.50 

-0.50 

1  3 

P9q 

-0.17 

-1  .1  7 

0.83 

-0.1  7 

-1  .1  7 

1.83 

1  4 

p10q 

-0.33 

-0.33 

1.67 

-0.33 

-1  .33 

0.67 

1   5 

p11q 

0.67 

-1.33 

0.67 

0.67 

-1  .33 

0.67 

1   6 

p12q 

-0.17 

-1  .1  7 

2.83 

-1  .1  7 

-0.1  7 

-0.17 

1  7 

p13q 

0.67 

-0.33 

-0.33 

0.67 

-0.33 

-0.33 

1   8 

p14q 

-0.67 

0.33 

-0.67 

1.33 

-0.67 

0.33 

1   9 

p15q 

-2.00 

-2.00 

0.00 

2.00 

2.00 

0.00 

2  0 

p16q 

-1.17 

-1  .1  7 

1.83 

0.83 

-0.1  7 

-0.17 

2  1 

p17q 

-0.33 

-2.33 

0.67 

1.67 

-1  .33 

1.67 

2  2 

p18q 

1  .17 

-1  .83 

1.17 

0.17 

-1  .83 

1.17 

2  3 

p19u 

-0.17 

-0.1  7 

-1  .1  7 

1.83 

-0.1  7 

-0.17 

2  4 

p20u 

1.67 

-1  .33 

0.67 

-0.33 

-1  .33 

0.67 

2  5 

p21u 

-0.1  7 

-0.1  7 

-1  .1  7 

-0.1  7 

0.83 

0.83 

2  6 

p22u 

0.17 

-0.83 

-0.83 

0.17 

0.17 

1.17 

2  7 

p23u 

-0.50 

-0.50 

0.50 

0.50 

-0.50 

0.50 

2  8 

p24u 

-1.17 

0.83 

-0.1  7 

-0.1  7 

0.83 

-0.1  7 

2  9 

p25u 

0.33 

0.33 

-0.67 

-0.67 

-0.67 

1.33 

3  0 

p26u 

-0.67 

-0.67 

1.33 

0.33 

-0.67 

0.33 

3  1 

p27u 

0.00 

0.00 

0.00 

0.00 

-1  .00 

1.00 

3  2 

p28u 

0.33 

-0.67 

-0.67 

0.33 

0.33 

0.33 

3  3 

p29u 

-0.67 

1.33 

-0.67 

0.33 

-0.67 

0.33 

3  4 

p30u 

-0.33 

-0.33 

-0.33 

0.67 

-0.33 

0.67 

3  5 

p31u 

-0.33 

-0.33 

-0.33 

1.67 

-1  .33 

0.67 

3  6 

p32u 

0.17 

-0.83 

-0.83 

1.17 

0.17 

0.17 

3  7 

p33u 

-0.67 

-0.67 

0.33 

1.33 

-0.67 

0.33 

3  8 

p34u 

0.33 

0.33 

0.33 

•  0.33 

-0.67 

-0.67 

3  9 

p35u 

-0.17 

-0.1  7 

-0.1  7 

-0.1  7 

-0.1  7 

0.83 

4  0 

p36u 

-1  .00 

-1  .00 

1.00 

1.00 

-1  .00 

1.00 

4  1 

p37u 

-0.67 

-0.67 

1.33 

-0.67 

0.33 

0.33 

4  2 

p38u 

-0.83 

-0.83 

0.17 

1.17 

0.17 

0.17 

4  3 

p39u 

-0.67 

0.33 

-0.67 

0.33 

0.33 

0.33 

4  4 

p40u 

0.17 

-0.83 

0.17 

0.17 

0.17 

0.17 

4  5 

p41  u 

-0.83 

-0.83 

0.17 

1.17 

0.17 

0.17 

4  6 

p42u 

1.33 

0.33 

-0.67 

-0.67 

-0.67 

0.33 
Normalized data for evaluating DFDs' ease of implementation

3 

4 

Ease  of  imp. 

expn-1 

expn-2

expn-3 

expn-4 

expn-5 

expn-6 

5 

6 

p1q

-1  .00 

-1  .00 

-1.00 

2.00 

0.00 

1.00 

7 

P2q 

-1  .67 

0.33 

0.33 

1.33 

0.33 

-0.67 

8 

P3q 

0.00 

1.00 

0.00 

1.00 

-1  .00 

-1  .00 

9 

p4q

-0.33 

-0.33 

-0.33 

0.67 

-1  .33 

1.67 

1  0 

P5q 

0.00 

-1.00 

-2.00 

3.00 

-1  .00 

1  .00 

1  1 

p7q 

0.17 

0.17 

1.17 

0.17 

-1  .83 

0.17 

1  2 

P8q 

-1  .67 

-1.67 

-0.67 

2.33 

0.33 

1.33 

1  3 

P9q 

-1  .50 

-1.50 

0.50 

1.50 

0.50 

0.50 

1  4 

p10q 

-0.17 

-1.17 

0.83 

0.83 

-1.17 

0.83 

1  5 

p11q 

-0.33 

-1  .33 

0.67 

1.67 

-1  .33 

0.67 

1  6 

p12q 

-1  .17 

0.83 

0.83 

0.83 

-2.17 

0.83 

1  7 

p13q 

0.00 

-1.00 

-1.00 

2.00 

0.00 

0.00 

1  8 

p14q 

0.50 

0.50 

0.50 

-1.50 

-1  .50 

1.50 

1  9 

p15q 

-0.83 

-0.83 

0.17 

1.17 

1.17 

-0.83 

2  0 

p16q 

-1  .1  7 

-2.17 

2.83 

1.83 

-1.17 

-0.17 

2  1 

p17q 

-0.67 

-2.67 

0.33 

2.33 

-0.67 

1  .33 

2  2 

p18q 

-0.33 

-1  .33 

1.67 

0.67 

-1  .33 

0.67 

2  3 

p19u 

0.17 

-0.83 

-0.83 

2.17 

-0.83 

0.17 

2  4 

p20u

-0.50 

-1.50 

1.50 

1.50 

-0.50 

-0.50 

2  5 

p21u 

-1  .00 

0.00 

-1  .00 

2.00 

0.00 

0.00 

2  6 

p22u 

0.83 

-1  .17 

-2.17 

1.83 

0.83 

-0.1  7 

2  7 

p23u 

-0.50 

-0.50 

-0.50 

1.50 

-0.50 

0.50 

2  8 

p24u 

-1  .67 

1.33 

-0.67 

0.33 

0.33 

0.33 

2  9 

p25u 

-0.33 

1.67 

-1.33 

0.67 

-1  .33 

0.67 

3  0 

p26u 

0.67 

-1.33 

0.67 

-0.33 

-1  .33 

1.67 

3  1 

p27u 

-0.33 

-0.33 

-1.33 

2.67 

-0.33 

-0.33 

3  2 

p28u 

0.50 

-0.50 

-0.50 

1.50 

-0.50 

-0.50 

3  3 

p29u 

0.33 

-0.67 

-0.67 

1.33 

-0.67 

0.33 

3  4 

p30u 

0.00 

-1  .00 

-1  .00 

2.00 

0.00 

0.00 

3  5 

p31u 

-0.50 

-0.50 

-0.50 

1.50 

-0.50 

0.50 

3  6 

p32u 

-0.1  7 

-1.17 

-1  .17 

1.83 

-0.17 

0.83 

3  7 

p33u 

0.17 

-0.83 

-1  .83 

2.17 

-0.83 

1.17 

3  8 

p34u 

0.00 

-1.00 

1.00 

1.00 

????? 

-1  .00 

3  9 

p35u 

-0.83 

-0.83 

-0.83 

1.17 

1.17 

0.17 

4  0 

p36u 

0.83 

-1.17 

0.83 

1.83 

-2.17 

-0.17 

4  1 

p37u 

-1  .33 

-1  .33 

-1  .33 

1.67 

1.67 

0.67 

4  2 

p38u 

-0.67 

-0.67 

-0.67 

1.33 

0.33 

0.33 

4  3 

p39u 

-0.1  7 

0.83 

-0.17 

-0.17 

-0.17 

-0.17 

4  4 

p40u 

0.67 

-1.33 

-1.33 

1.67 

-0.33 

0.67 

4  5 

p41u 

-0.50 

-1  .50 

-0.50 

1.50 

0.50 

0.50 

4  6 

p42u 

1.67 

-0.33 

-0.33 

0.67 

-1  .33 

-0.33 

4  7 

p43u 

-0.50 

-1  .50 

-0.50 

1.50 

-0.50 

1.50 

4  8 

p44u 

1.33 

-0.67 

-0.67 

0.33 

-0.67 

0.33 

4  9 

p45u 

-0.33 

-1  .33 

0.67 

1.67 

-0.33 

-0.33 

5  0 

p46u 

-0.50 

-0.50 

-0.50 

2.50 

-0.50 

-0.50 

E-  11 


J 

K 

L 

M 

N 

O 

P 

5  1 

p47u 

-0.33 

-1  .33 

-0.33 

2.67 

-0.33 

-0.33 

5  2 

p48u 

-0.33 

-1.33 

-0.33 

0.67 

0.67 

0.67 

5  3 

p49u 

-0.83 

-0.83 

-0.83 

2.17 

-0.83 

1  .17 

5  4 

p50u 

-1  .17 

-1.17 

-1.17 

2.83 

-1  .1  7 

1.83 

5  5 

p51  u 

-1  .1  7 

-0.17 

-0.17 

0.83 

0.83 

-0.17 

5  6 

p52u 

-0.83 

-1  .83 

2.17 

1.17 

-1  .83 

1.17 

5  7 

p53u 

-2.67 

-2.67 

2.33 

1.33 

-0.67 

2.33 

5  8 

5  9 

Range

Max: 

3 

Min: 

-2.67 

Total: 

5.667 

6  0 

Subrange

0.944 

6  1 

6  2 

-2.67     •-     -1.725 

1 

4 

3 

0 

4 

0 

6  3 

-1.725     --     -0.78 

1  5 

28 

1  3 

1 

1  5 

3 

6  4 

-0.78     --     0.165 

27 

1  3 

1  9 

3 

20 

1  9 

6  5 

0.165    -     1.11 

7 

5 

1  1 

1  2 

9 

1  9 

6  6 

1.11     --    2.055 

2 

2 

3 

26 

3 

1  0 

6  7 

2.055    -    3.00 

0 

0 

3 

1  0 

0 

1 

6  8 

6  9 

total 

52 

52 

52 

52 

51 

52 

7  0 

7  1 

Avg

-0.39 

-0.79 

-0.18 

1  .40 

-0.47 

0.42 

7  2 

S.D. 

0.80 

0.90 

1.09       0.85 

0.87 

0.77 

7  3 

Max 

1.67 

1.67 

2.83I      3.00 

1.67 

2.33 

7  4 

Min 

-2.67 

-2.67 

-2.17 

-1.50 

-2.17 

-1  .00 

7  5 

7  6 

Calculate  the  fuzzy  membership  grades  from  the  poll  results 

7  7 

7  8 

0.25 

1 

0.75 

0 

1 

0 

7  9 

0.536 

1 

0.464 

0.036 

0.536 

0.107 

8  0 

1 

0.481 

0.704 

0.1  1  1 

0.741 

0.704 

8  1 

0.368 

0.263 

0.579 

0.632 

0.474 

1 

8  2 

0.077 

0.077 

0.1151              1 

0.115 

0.385 

8  3 

0 

0 

0.3            1 

0 

0.1 

8  4 

I 

8  5 

Rearrange  the  columns  in  the  order  of  complexity  measures 

8  6 

8  7 

0.75 

1 

0.25 

0 

1 

0 

8  8 

0.464 

1 

0.536 

0.107 

0.536 

0.036 

8  9 

0.704 

0.481 

1 

0.704 

0.741 

0.1  1  1 

9  0 

0.579 

0.263 

0.368 

1 

0.474 

0.632 

9  1 

0.115 

0.077 

0.077 

0.385 

0.1  15 

1 

9  2 

0.3 

0 

0 

0.1 

0 

1 

9  3 

9  4 

Remove  the  outlier  DFD5  and  normalize  when  needed 

9  5 

9  6 

0.75 

1 

0.25 

0

0 

9  7 

0.464 

1 

0.536 

0.107 

0.036 

9  8 

0.704 

0.481 

1 

0.704 

0.1  1  1 

9  9 

0.579 

0.263 

0.368 

1 

0.632 

1  0  0 

0.115 

0.077 

0.077 

0.385 

1 

1  0  1 

0.3 

0 

0 

0.1 

1 
Correlation Among Variables


File: try.2
Include all cases


size: 6 * 25  MISS= -9999


VARIABLE       MEAN    STD. DEV.

V7            -.002       1.033
V8            0.000       1.226
V9             .000        .626
V10           0.000        .440
V11           -.002        .796
V12           -.002        .749
V13         576.093     710.340
V14          14.538       7.286
V15          30.000      24.585
V16           2.654        .787
V17           3.438        .977
V18            .905        .352
V19           3.515       3.465
V20           1.596        .317
V21           1.453        .339
V22           3.049        .637
V23           2.250       1.405
V24           1.403        .455
V25           3.632       2.892


CORRELATION MATRIX


         V7    V8    V9   V10   V11   V12   V13   V14   V15   V16   V17   V18   V19   V20

V7  |  1.00  -.94  -.58  -.67  -.88   .89  -.92  -.91  -.93  -.89  -.85  -.76  -.93  -.36
V8  |  -.94  1.00   .70   .68   .88  -.88   .90   .93   .90   .86   .90   .92   .90   .19
V9  |  -.58   .70  1.00   .92   .88  -.86   .63   .52   .56   .25   .37   .72   .57  -.55
V10 |  -.67   .68   .92  1.00   .93  -.90   .61   .49   .56   .29   .33   .60   .56  -.41
V11 |  -.88   .88   .88   .93  1.00  -.99   .85   .75   .82   .59   .61   .75   .82  -.12
V12 |   .89  -.88  -.86  -.90  -.99  1.00  -.89  -.77  -.86  -.61  -.63  -.74  -.86   .08
V13 |  -.92   .90   .63   .61   .85  -.89  1.00   .92   .99   .80   .81   .74  1.00   .25
V14 |  -.91   .93   .52   .49   .75  -.77   .92  1.00   .94   .93   .96   .86   .94   .39
V15 |  -.93   .90   .56   .56   .82  -.86   .99   .94  1.00   .85   .84   .73  1.00   .35
V16 |  -.89   .86   .25   .29   .59  -.61   .80   .93   .85  1.00   .97   .75   .85   .65
V17 |  -.85   .90   .37   .33   .61  -.63   .81   .96   .84   .97  1.00   .87   .85   .51
V18 |  -.76   .92   .72   .60   .75  -.74   .74   .86   .73   .75   .87  1.00   .75   .02
V19 |  -.93   .90   .57   .56   .82  -.86  1.00   .94  1.00   .85   .85   .75  1.00   .34
V20 |  -.36   .19  -.55  -.41  -.12   .08   .25   .39   .35   .65   .51   .02   .34  1.00
V21 |   .08  -.29  -.85  -.70  -.51   .46  -.15  -.06  -.05   .22   .05  -.45  -.07   .88
V22 |  -.13  -.06  -.72  -.58  -.33   .29   .05   .17   .15   .44   .28  -.23   .13   .97
V23 |  -.84   .91   .65   .61   .77  -.78   .80   .94   .81   .83   .91   .95   .82   .17
V24 |  -.57   .66   .37   .29   .43  -.46   .57   .82   .60   .72   .81   .80   .61   .25
V25 |  -.77   .85   .57   .52   .67  -.69   .74   .93   .76   .81   .90   .92   .77   .20

        V21   V22   V23   V24   V25

V7  |   .08  -.13  -.84  -.57  -.77
V8  |  -.29  -.06   .91   .66   .85
V9  |  -.85  -.72   .65   .37   .57


F-3 


V10 |  -.70  -.58   .61   .29   .52
V11 |  -.51  -.33   .77   .43   .67
V12 |   .46   .29  -.78  -.46  -.69
V13 |  -.15   .05   .80   .57   .74
V14 |  -.06   .17   .94   .82   .93
V15 |  -.05   .15   .81   .60   .76
V16 |   .22   .44   .83   .72   .81
V17 |   .05   .28   .91   .81   .90
V18 |  -.45  -.23   .95   .80   .92
V19 |  -.07   .13   .82   .61   .77
V20 |   .88   .97   .17   .25   .20
V21 |  1.00   .97  -.29  -.11  -.23
V22 |   .97  1.00  -.07   .07  -.02
V23 |  -.29  -.07  1.00   .89   .99
V24 |  -.11   .07   .89  1.00   .95
V25 |  -.23  -.02   .99   .95  1.00

Number  of  cases :   6 
Number  of  missing  cases 
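
Each entry of the matrix above is a Pearson correlation between two variables measured on the six cases. The raw values of V7 to V25 are not reproduced in this appendix, so the Python sketch below uses hypothetical data; the function names are illustrative and are not the routines of the statistics package that produced the printout.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map each pair of variable names to its correlation coefficient."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}


# Hypothetical measurements on six cases (not the thesis data).
data = {
    "U1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "U2": [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
    "U3": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(data)
print(round(matrix[("U1", "U2")], 2), round(matrix[("U1", "U3")], 2))
```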


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V11

INDEPENDENT VARIABLES: V12

MULTIPLE CORRELATION: .9944     F( 1, 4) = 355.354     p = 0.000
R-square: .9889

BETA for V12 = -.9944189   B = -1.0569654   t( 4) = -18.851   p = 0.000
INTERCEPT = -.0034283

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    3.131     3.13     1    355.35   0.000
RESIDUAL       .035      .01     4
TOTAL         3.166


F-4


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V11

INDEPENDENT VARIABLES: V9

MULTIPLE CORRELATION: .8766     F( 1, 4) = 13.267     p = .023
R-square: .7683

BETA for V9 = .8765532   B = 1.1145439   t( 4) = 3.642   p = .023
INTERCEPT = -.0016667

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    2.432     2.43     1     13.27    .023
RESIDUAL       .733      .18     4
TOTAL         3.166
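
For the single-predictor runs, the printed quantities are tied together by standard identities: R-square is r squared, F(1, n-2) = r^2 (n-2) / (1 - r^2), t is the signed square root of F, B = r * sd_y / sd_x, and the intercept is mean_y - B * mean_x. The Python sketch below checks these against the V11-on-V9 run, using the rounded means and standard deviations from the variable list, so B and the intercept agree only approximately. The function name is illustrative.

```python
import math


def simple_regression_stats(r, n, sd_x, sd_y, mean_x, mean_y):
    """Derive one-predictor regression statistics from r and summary statistics."""
    r2 = r * r
    df_resid = n - 2
    f = r2 / (1 - r2) * df_resid
    t = math.copysign(math.sqrt(f), r)    # t carries the sign of the slope
    b = r * sd_y / sd_x                   # unstandardized slope
    intercept = mean_y - b * mean_x
    return {"r2": r2, "F": f, "t": t, "B": b, "intercept": intercept}


# V11 regressed on V9: r = .8765532, six cases, summary values as printed.
stats = simple_regression_stats(
    r=0.8765532, n=6, sd_x=0.626, sd_y=0.796, mean_x=0.000, mean_y=-0.002
)
for name, value in stats.items():
    print(name, round(value, 4))
```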


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V11

INDEPENDENT VARIABLES: V13

MULTIPLE CORRELATION: .8488     F( 1, 4) = 10.309     p = .033
R-square: .7205

BETA for V13 = .8487989   B = .0009508   t( 4) = 3.211   p = .033
INTERCEPT = -.5494314

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    2.281     2.28     1     10.31    .033
RESIDUAL       .885      .22     4
TOTAL         3.166


F-5


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V9

INDEPENDENT VARIABLES: V16

MULTIPLE CORRELATION: .2501     F( 1, 4) = .267     p = .634
R-square: .0626

BETA for V16 = .2501416   B = .1990354   t( 4) = .517   p = .634
INTERCEPT = -.5283063

Analysis of variance

                 SS       MS    df         F       p
REGRESSION     .123      .12     1       .27    .634
RESIDUAL      1.836      .46     4
TOTAL         1.958


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V14

MULTIPLE CORRELATION: .9254     F( 1, 4) = 23.846     p = .009
R-square: .8564

BETA for V14 = .9253942   B = .1556988   t( 4) = 4.883   p = .009
INTERCEPT = -2.2636012

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    6.435     6.43     1     23.85    .009
RESIDUAL      1.079      .27     4
TOTAL         7.514


F-6 


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V17

MULTIPLE CORRELATION: .8983     F( 1, 4) = 16.721     p = .016
R-square: .8070

BETA for V17 = .8983080   B = 1.1268505   t( 4) = 4.089   p = .016
INTERCEPT = -3.8739241

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    6.064     6.06     1     16.72    .016
RESIDUAL      1.451      .36     4
TOTAL         7.514


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V18

MULTIPLE CORRELATION: .9210     F( 1, 4) = 22.347     p = .010
R-square: .8482

BETA for V18 = .9209673   B = 3.2101341   t( 4) = 4.727   p = .010
INTERCEPT = -2.9046363

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    6.373     6.37     1     22.35    .010
RESIDUAL      1.141      .29     4
TOTAL         7.514


F-7


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V19

MULTIPLE CORRELATION: .9031     F( 1, 4) = 17.689     p = .015
R-square: .8156

BETA for V19 = .9030920   B = .3194922   t( 4) = 4.206   p = .015
INTERCEPT = -1.1230151

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    6.128     6.13     1     17.69    .015
RESIDUAL      1.386      .35     4
TOTAL         7.514


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V22

MULTIPLE CORRELATION: .0612     F( 1, 4) = .015     p = .874
R-square: .0037

BETA for V22 = -.0612277   B = -.1179179   t( 4) = -.123   p = .874
INTERCEPT = .3594728

Analysis of variance

                 SS       MS    df         F       p
REGRESSION     .028      .03     1       .02    .874
RESIDUAL      7.486     1.87     4
TOTAL         7.514


F-8


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V8

INDEPENDENT VARIABLES: V25

MULTIPLE CORRELATION: .8472     F( 1, 4) = 10.175     p = .034
R-square: .7178

BETA for V25 = .8472442   B = .3591264   t( 4) = 3.190   p = .034
INTERCEPT = -1.3044070

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    5.394     5.39     1     10.18    .034
RESIDUAL      2.120      .53     4
TOTAL         7.514


MULTIPLE REGRESSION RESULTS
DEPENDENT VARIABLE: V8
INDEPENDENT VARIABLES: V17  V18  V19  V22  V25

MULTIPLE CORRELATION: 1.0000     F( 5, 0) = .000
R-square: 1.0000

BETA for V17 = -45.4355772   B =  -56.9950411   t( 0) = .000   p =
BETA for V18 =  40.6040409   B =  141.5299048   t( 0) = .000   p =
BETA for V19 =   6.9138418   B =    2.4459509   t( 0) = .000   p =
BETA for V22 =  20.9756645   B =   40.3968890   t( 0) = .000   p =
BETA for V25 =   -.6366466   B =    -.2698592   t( 0) = .000
INTERCEPT = -62.8887842

* Warning:  Multiple R is equal to 1.0.
* Significance of R and Beta cannot be calculated


F-9
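
The warning follows from the degrees of freedom: with six cases, five predictors and an intercept, the residual degrees of freedom are 6 - 5 - 1 = 0, so the fitted plane passes through every case and R equals 1 regardless of the data. The Python sketch below illustrates the same effect on a smaller, hypothetical data set (three cases, two predictors), using exact fractions so that the zero residuals are not hidden by rounding. Nothing here reproduces the original package's algorithm.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination, exactly."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        # Raises StopIteration if the system is singular (no usable pivot).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [row[-1] for row in m]


def residual_df(n_cases, n_predictors):
    """Residual degrees of freedom of a regression with an intercept."""
    return n_cases - n_predictors - 1


# Hypothetical data: 3 cases, 2 predictors, so residual df = 3 - 2 - 1 = 0.
x = [[1, 2], [2, 1], [3, 5]]
y = [4, 3, 10]
design = [[1] + row for row in x]          # prepend the intercept column
coef = solve(design, y)
fitted = [sum(c * v for c, v in zip(coef, row)) for row in design]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print(residual_df(6, 5), residuals)
```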


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V7

INDEPENDENT VARIABLES: V8

MULTIPLE CORRELATION: .9425     F( 1, 4) = 31.804     p = .006
R-square: .8883

BETA for V8 = -.9424870   B = -.7939900   t( 4) = -5.640   p = .006
INTERCEPT = -.0016667

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    4.737     4.74     1     31.80    .006
RESIDUAL       .596      .15     4
TOTAL         5.333


MULTIPLE REGRESSION RESULTS
DEPENDENT VARIABLE: V7
INDEPENDENT VARIABLES: V15

MULTIPLE CORRELATION: .9336     F( 1, 4) = 27.157     p = .008
R-square: .8716

BETA for V15 = -.9336054   B = -.0392191   t( 4) = -5.211   p = .008
INTERCEPT = 1.1749051

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    4.648     4.65     1     27.16    .008
RESIDUAL       .685      .17     4
TOTAL         5.333


F-10


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V7

INDEPENDENT VARIABLES: V13

MULTIPLE CORRELATION: .9180     F( 1, 4) = 21.427     p = .011
R-square: .8427

BETA for V13 = -.9179804   B = -.0013346   t( 4) = -4.629   p = .011
INTERCEPT = .7672092

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    4.494     4.49     1     21.43    .011
RESIDUAL       .839      .21     4
TOTAL         5.333


F-11


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE: V7

INDEPENDENT VARIABLES: V8  V15

MULTIPLE CORRELATION: .9639     F( 2, 3) = 19.635     p = .023
R-square: .9290

BETA for V8  = -.5373151   B = -.4526564   t( 3) = -1.558   p = .217
BETA for V15 = -.4526687   B = -.0190158   t( 3) = -1.312   p = .281
INTERCEPT = .5688068

Analysis of variance

                 SS       MS    df         F       p
REGRESSION    4.954     2.48     2     19.63    .023
RESIDUAL       .378      .13     3
TOTAL         5.333


MULTIPLE REGRESSION RESULTS

DEPENDENT VARIABLE:     V7
INDEPENDENT VARIABLES:  V10

MULTIPLE CORRELATION:  .6674     F( 1, 4) = 3.212     P = .147
R-square:  .4454

BETA for V10 = -.6673713   B = -1.5652981   t( 4) = -1.792   P = .147
INTERCEPT    = -.0016667

Analysis of variance

               SS     df      MS        F       P
REGRESSION   2.375     1    2.38     3.21    .147
RESIDUAL     2.958     4     .74
TOTAL        5.333




Example one:

[Figure: data flow diagram for Example one, with inputs a, b, c, d, numbered modules, and data items M1, M2, M3.]

Two possible TR representations:

1) Starts with module '1':

         1           3                    2
'a','b' --> (M1 --> ([M3], ('c','d' -->
      4          <-- 5
(M1 --> (M2, M3 --> "x" )))))

Atomic I/O relationships:

'a','b'  --> M1(1)
M1(1)    --> M3
'c','d'  --> M1(2)
M1(2)    --> M2
M3       --> x
M2       --> x

(M1(1): the M1 from module '1')
(M1(2): the M1 from module '2')

According to the above atomic I/O relationships, an I/O matrix can be built as follows:




          M1(1)   M1(2)   M2      M3      x
a           X
b           X
c                   X
d                   X
M1(1)                               X
M1(2)                       X
M2                                          X
M3                                          X
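The construction of the I/O matrix from the atomic I/O relationships can be sketched in code. The following Python sketch is illustrative only: the data structure, the function names `build_io_matrix` and `format_matrix`, and the text layout are assumptions, not the implementation used in this thesis. Rows are the items consumed and columns are the items produced, so the row and column order follows the order in which the relationships are listed.

```python
# Atomic I/O relationships of Example one as (inputs, outputs) pairs.
# "M1(1)" and "M1(2)" are the M1 produced by module 1 and by module 2.
ATOMIC = [
    (["a", "b"], ["M1(1)"]),
    (["M1(1)"], ["M3"]),
    (["c", "d"], ["M1(2)"]),
    (["M1(2)"], ["M2"]),
    (["M3"], ["x"]),
    (["M2"], ["x"]),
]


def build_io_matrix(relationships):
    """Return (rows, cols, cells); cells holds every (input, output) pair."""
    rows, cols, cells = [], [], set()
    for inputs, outputs in relationships:
        for item in inputs:
            if item not in rows:
                rows.append(item)
        for item in outputs:
            if item not in cols:
                cols.append(item)
        cells.update((i, o) for i in inputs for o in outputs)
    return rows, cols, cells


def format_matrix(rows, cols, cells):
    """Render the matrix as text, marking each input/output pair with an X."""
    width = max(len(name) for name in rows + cols) + 2
    lines = ["".join(name.ljust(width) for name in [""] + cols)]
    for r in rows:
        marks = ["X" if (r, c) in cells else "" for c in cols]
        lines.append("".join(s.ljust(width) for s in [r] + marks))
    return "\n".join(line.rstrip() for line in lines)


rows, cols, cells = build_io_matrix(ATOMIC)
print(format_matrix(rows, cols, cells))
```

Because the set of marked cells does not depend on the order of the relationships, two TR representations of the same DFD yield the same cells even if they are listed in a different order.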

2) Starts with module '2':

         2           4                    1
'c','d' --> (M1 --> ([M2], ('a','b' -->
                 <-- 5
(M1 --> (M3, M2 --> "x" )))))

Atomic I/O relationships:

'c','d' --> M1(2),   M1(2) --> M2
'a','b' --> M1(1),   M1(1) --> M3
M2, M3  --> x

According to the above atomic I/O relationships, we get exactly the same I/O matrix as the one
shown above.




Example two:

[Figure: data flow diagram for Example two, including data items M4 and M5.]


Two possible TR representations:

1) Starts with module '1':

        1                     3               <-- 2
'a',M4 --> ([M0], ('b',M5 --> [M3] | (M1,M0 -->

M4* || "x" )) | (M2,M3 --> M5* || "y" )


Atomic I/O relationships:

'a',M4 --> M0 | M2
'b',M5 --> M3 | M1
M1,M0  --> M4 | "x"
M2,M3  --> M5 | "y"

Two loops are involved:

M4 - 1 - M0 - 2 - M4
M5 - 3 - M3 - 4 - M5




2) Starts with module '2':

        3                     1               <-- 4
'b',M5 --> ([M3], ('a',M4 --> [M0] | (M2,M3 -->
 <-- 2
M5* || "y" )) | (M1,M0 --> M4* || "x" )


Atomic I/O relationships:

'b',M5 --> M3 | M1
'a',M4 --> M0 | M2
M1,M0  --> M4 | "x"
M2,M3  --> M5 | "y"

Two loops are involved:

M4 - 1 - M0 - 2 - M4
M5 - 3 - M3 - 4 - M5
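The two lists of atomic I/O relationships and the two loop paths of Example two can be compared mechanically. The sketch below is a hypothetical illustration in Python, not the procedure used in this thesis. It assumes that the items on either side of `|` are simply outputs of the module, and it takes the module numbers from the loop paths listed above; the names `FIRST`, `SECOND`, and `follows_loop` are invented for the example.

```python
# Each atomic I/O relationship: module number -> (inputs, outputs).
# Module numbers are inferred from the two loop paths (an assumption).
FIRST = {
    1: ({"a", "M4"}, {"M0", "M2"}),
    3: ({"b", "M5"}, {"M3", "M1"}),
    2: ({"M1", "M0"}, {"M4", "x"}),
    4: ({"M2", "M3"}, {"M5", "y"}),
}
SECOND = {
    3: ({"b", "M5"}, {"M3", "M1"}),
    1: ({"a", "M4"}, {"M0", "M2"}),
    2: ({"M1", "M0"}, {"M4", "x"}),
    4: ({"M2", "M3"}, {"M5", "y"}),
}


def follows_loop(relationships, path):
    """Check a path such as ["M4", 1, "M0", 2, "M4"]: it must return to its
    starting item, and each module must consume the item before it and
    produce the item after it."""
    if path[0] != path[-1]:
        return False
    for k in range(1, len(path), 2):
        inputs, outputs = relationships[path[k]]
        if path[k - 1] not in inputs or path[k + 1] not in outputs:
            return False
    return True


LOOPS = [["M4", 1, "M0", 2, "M4"], ["M5", 3, "M3", 4, "M5"]]

same_relationships = FIRST == SECOND  # dict comparison ignores listing order
print(same_relationships)
print([follows_loop(FIRST, p) for p in LOOPS])
```

Storing the relationships in a dictionary keyed by module number makes the comparison independent of the order in which a TR representation happens to list them.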

All the atomic I/O relationships and the loop paths are exactly the same as the ones we
developed from the first TR representation. Therefore, they correspond to the same DFD.


Abstract


Data flow diagrams (DFDs) are important tools in software design. They express not only the
data flow within a software system but also, in part, its control structure, which is a
critical factor. Evaluating data flow diagrams is therefore as important as other evaluations
of software design, and this is why some measures of software complexity, e.g., Henry and
Kafura's, are based on the data flow diagram. One problem is that such measures must be
calculated from graphs, which is indirect and prone to mistakes. This has motivated some
researchers to develop representations of data flow diagrams that are easier to formalize;
one example is Adler's representation. Problems remain in these representations. For example,
Adler's representation is not general enough to express different kinds of graphs, so it is
hard to use as a basis for calculating measures. Furthermore, a representation should reflect
the characteristics of the whole graph, so that useful measures can be derived directly from
it and the evaluation of data flow diagrams can be automated by storing knowledge of the
representation and evaluation in an expert system.

The motivation of this thesis is to develop such a representation by extending Adler's. The
representation should be useful as the basis of software measures for whole data flow diagrams.
Another aim of this research is to acquire knowledge about evaluating DFDs by comparing the
resulting measures with experts' evaluations of the DFDs, and to build classification
categories with the help of membership functions from fuzzy set theory. This will provide a
good basis for the automatic evaluation of DFDs.

The thesis covers the following:

1) Introduce the representation of DFDs and discuss how it reflects their characteristics.

2) Develop and describe how to calculate measures directly from the representation, and
demonstrate the calculation on sample DFDs.

3) Compare the calculated measures with the experts' opinions and build the membership
functions for the classification categories of the evaluation of DFDs.


